

THE EFFECT OF REDUCING COGNITIVE COMPLEXITY ON A 
HYPOTHETICO-DEDUCTIVE REASONING TASK 



By 

J. PAMELA MAREK-LOVEJOY 



A DISSERTATION TO BE PRESENTED TO THE GRADUATE SCHOOL 

OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT 

OF THE REQUIREMENTS FOR THE DEGREE OF 

DOCTOR OF PHILOSOPHY 

UNIVERSITY OF FLORIDA 

1998 



ACKNOWLEDGMENTS 
I wish to express my enduring gratitude to my mentor, Richard A. Griggs, who 
has served as an inspiration and a motivating force throughout this investigation, for his 
ongoing concern and tenacious commitment. I also wish to thank the members of my 
committee, Shari Ellis, Ira Fischler, Patricia H. Miller, and Chris Janiszewski, for their 
continued support. 

I am also indebted to my colleague, Andrew Christopher, who has been a constant 
source of encouragement. My deepest appreciation goes to my husband, Leon Lovejoy, 
for his emotional and material support during the time this dissertation was being 
prepared. 






TABLE OF CONTENTS 

page 

ACKNOWLEDGMENTS ii 

ABSTRACT iv 

INTRODUCTION 1 

LITERATURE REVIEW 4 

What is the Meaning of "Or"? 4 

Introducing the THOG 8 

Early Investigations 9 

Introducing Realism 17 

The Era of Separation 26 

Providing Procedural Cues 38 

Summary 43 

DISSERTATION FRAMEWORK AND PLAN 46 

METHODS, MATERIALS AND RESULTS 56 

General Procedures 56 

Experiments 1a and 1b 56 

Experiments 2a and 2b 62 

Experiment 3 66 

Experiments 4a and 4b 68 

GENERAL DISCUSSION 83 

CONCLUSIONS AND FUTURE DIRECTIONS 95 

REFERENCES 99 

APPENDIX A WORDING OF PROBLEMS 103 

APPENDIX B MATERIALS 112 

BIOGRAPHICAL SKETCH 137 






Abstract of Dissertation Presented to the Graduate School 

of the University of Florida in Partial Fulfillment of the 

Requirements for the Degree of Doctor of Philosophy 

By 

J. Pamela Marek-Lovejoy 

August 1998 

Chairman: Richard A. Griggs 
Major Department: Psychology 

In adaptive reasoning, prior knowledge and selective attention may promote 
efficiency by reducing information processing demands. However, in logical tasks, our 
knowledge base and focusing strategies may lead us astray. The THOG problem is one of 
a trilogy of tasks designed by Peter Wason to unveil systematic nonlogical tendencies that 
divert people from the appropriate solution path. Based on an exclusive disjunctive rule, 
the correct solution to the THOG problem involves hypothesis generation, classification 
of four designs given alternative hypotheses, and simultaneous evaluation of information 
from multiple hypotheses. Typically no more than one third of adult participants solve the 
standard version of the problem. Confusion theory suggests that the cognitive complexity 
of the problem leads people to adopt simplistic strategies such as focusing on properties 
of the positive example. Clearly separating the hypothesized properties from those of the 
positive example ameliorates this difficulty in some instances, but facilitation in an 
abstract context is closely linked to the intricacies of instruction wording. This research 
explicates difficulties encountered in the THOG problem by examining the effects of 
restricting the number of alternative classifications and providing procedural assistance. 
In seven experiments, 603 University of Florida undergraduates each completed one 
pencil-and-paper version of the problem. Eliminating an "indeterminate" option from the 
set of three possible classifications provided facilitation, as did wording the instruction to 
focus attention on one other THOG in addition to the positive example. This facilitation 
could, however, reflect nonlogical strategies such as focusing on uniqueness. A complete 
explanation of the problem to the point of simultaneous hypothesis testing had a null 
effect. Results strongly suggest that within the confines of the standard abstract problem 
structure, consistent facilitation of logical reasoning is ultimately hampered by failure to 
accurately combine information from multiple hypotheses. Uncertainty regarding which 
of two hypotheses represents reality may influence people to bypass simultaneous 
hypothesis testing. Given this requirement in the THOG problem, its complexity may 
become an insurmountable challenge. Because problem demands simulate those in more 
realistic decision-making scenarios involving several alternatives and constraints, 
development of techniques to promote simultaneous evaluation of multiple hypotheses is 
recommended. 



INTRODUCTION 
Reasoning, the basis for development and testing of hypotheses and formulation 
of logical conclusions, requires filtering information to select relevant data for 
goal-directed decisions. Prior knowledge and selective attention may promote efficiency 
in reasoning by reducing information processing demands but may also inappropriately 
override logic. "To capture some of the same processes, the puzzlement, the doubts, the 
obsessive tendencies toward repetition, the compelling power of false clues..." (Wason, 
1978, p. 20) manifested in reasoning beyond the laboratory, Peter Wason devised a 
trilogy of experimental tasks: the 2-4-6 problem, the selection task, and the THOG 
problem. These problems illuminated weaknesses in reasoning, including resistance to 
falsification, a tendency to be misled by perceptual cues, and self-contradiction. The high 
error rates contradicted prevailing ideas concerning our abilities to reason logically 
(Evans & Newstead, 1995). 

Initial research on the 2-4-6 task, an inductive reasoning problem designed to 
explore the strategies people use to generate and test hypotheses, was published in 1960. 
In this task, participants aim to determine the rule used to create a three-number 
sequence. They do so by generating a series of three numbers, then receiving feedback 
indicating whether the series fits the rule. When participants are confident they know the 
rule, they report it to the experimenter. If they are incorrect, they resume generating 
triples. Although effective strategies involve successively testing and attempting to falsify 
different hypotheses, many participants tend to repeatedly test positive examples 
consistent with their original hypothesis (Tweney et al., 1980). 

The first major paper on the selection task, a deductive reasoning problem 
involving a conditional rule, was published in 1966. In this task, participants are shown 
one side of four cards (e.g., A, K, 4, 7), then indicate which cards they would turn over to 
determine the truth or falsity of a rule (e.g., "If a card has a vowel on one side, then it has 
an odd number on the other"). Typically, no more than 10% of adult participants make 
the logically correct choices (Evans & Newstead, 1995). The fame accorded the selection 
task, more widely researched than any other reasoning problem, was a primary impetus 
for the development of the THOG problem ten years later. According to Wason (1977), in 
the course of a decade, potential participants had become too familiar with the selection 
task, although they often remained confused about its solution even after exposure to 
relevant information (Wason, 1979; 1981). 

To delve further into contradictory elements in reasoning leading to a "crisis in 
belief," Wason created the THOG problem, a task requiring the generation of hypotheses 
and application of an exclusive disjunctive rule. For the past 20 years, researchers have 
been searching for reasons to explicate the difficulty of the THOG problem. This paper 
reviews the highlights of that search, emphasizing attempts to reduce its cognitive 
complexity within the confines of its standard structure. Facilitation efforts are designed 
to overcome identified biases that impede solution. 

As defined by Evans (1989), "bias" refers to "a systematic tendency to take 
account of factors irrelevant to the task at hand or to ignore relevant factors" (p. 9). The 
term does not necessarily imply that people are unable to reason logically. Rather, it 
suggests that specific problem features provoke use of inappropriate strategies that thwart 
solution. Knowledge of such bias facilitates understanding of the cognitive processes that 
underlie reasoning performance. 

To provide a framework for positioning subsequent experiments on the THOG 
problem and its relations, Section 1 of the Literature Review encapsulates related research 
on how people interpret disjunctive statements. Section 2 previews the standard version 
of the THOG, introducing the cognitive activities presumed to underlie its solution. 
Section 3 describes early studies of the THOG, including attempts at facilitation through 
the use of realistic material. Section 4 discusses efforts to reduce cognitive complexity by 
clearly separating the properties of the designated example from the hypothesized 
properties to which the disjunctive rule is applied during classification. Section 5 outlines 
attempts to guide people toward the correct solution by providing procedural cues. 
Section 6 summarizes information presented in preceding sections. 



LITERATURE REVIEW 
What is the Meaning of "Or"? 
Consider the two propositions (p and q) in the following sentence: "Jake is angry 
(p) or Jason is friendly (q)." Under what conditions is this sentence logically false? If both 
propositions (disjuncts) are false (if Jake is not angry and Jason is not friendly), then the 
disjunction, p or q, is false as well. Under what conditions is this sentence logically true? 
If p is true and q is false, then the sentence itself is true. Similarly, if p is false and q is 
true, then the sentence itself is true. These three determinations of truth and falsity hold 
regardless of the interpretation of "or." The fourth possible combination is not as clear 
cut. Suppose both disjuncts are true: Jake is angry and Jason is friendly. Given an 
inclusive interpretation, the original sentence is true because inclusive disjunction permits 
both propositions to occur. In contrast, given an exclusive interpretation, the original 
sentence is false because exclusive disjunction, by definition, specifies that p and q 
cannot both occur without falsifying a statement or violating a rule. Inclusive disjunction 
allows, but does not require, an "or both" interpretation, whereas an exclusive disjunction 
is limited to "but not both" scenarios. 

In propositional logic, the relationship between the truth and falsity of each of the 
propositions and the truth and falsity of the statement that they comprise is typically 
illustrated via the use of truth tables. For example, the truth table that follows (in which 
T = True and F = False) highlights how the difference between inclusive and exclusive 
disjunction occurs only in the case when both propositions p and q are true. 



                                                 Proposition   Truth 
Connective              Linguistic Form            p    q      of Rule 

Inclusive disjunction   p or q (or both)           T    T         T 
                                                   T    F         T 
                                                   F    T         T 
                                                   F    F         F 

Exclusive disjunction   p or q (but not both)      T    T         F 
                                                   T    F         T 
                                                   F    T         T 
                                                   F    F         F 
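
The contrast between the two readings of "or" can also be checked mechanically. The following is a minimal Python sketch, not part of the original study; the function and variable names are illustrative assumptions. It builds both truth tables and lists the cases on which they disagree.

```python
from itertools import product


def inclusive_or(p: bool, q: bool) -> bool:
    """p or q (or both)."""
    return p or q


def exclusive_or(p: bool, q: bool) -> bool:
    """p or q (but not both)."""
    return p != q


def truth_table(connective):
    """Map each (p, q) combination to the truth value of the disjunction."""
    return {(p, q): connective(p, q) for p, q in product([True, False], repeat=2)}


inclusive = truth_table(inclusive_or)
exclusive = truth_table(exclusive_or)

# Cases on which the two interpretations assign different truth values.
differing = [case for case in inclusive if inclusive[case] != exclusive[case]]
print(differing)
```

Under these definitions the only differing case is the one in which both disjuncts are true (TT).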

In linguistics, theorists have debated which of these two interpretations of 
disjunctive statements is "basic." Newstead and Griggs (1983) cited key positions in this 
debate. They noted that Gazdar (1979) argued for the primacy of the inclusive 
interpretation but considered that "or" may take on an exclusive meaning when used 
without qualification. This derived interpretation stems from the expectation that speakers 
provide the maximum amount of information. Lakoff (1971) advocated that "or" implies 
an alternative, and that most disjunctive sentences are congruent with an exclusive 
interpretation. Hurford (1974) claimed that "or" has dual meanings, with other languages 
assigning different words to each meaning. Evidence for the various positions in the 
linguistic debate included theorists' interpretations of different sentences and their 
meanings, rather than experimental data. 

In the psychology of reasoning, Evans and Newstead (1980) experimentally 
investigated people's perceptions of disjunctive statements involving letters and numbers. 
In Experiment 1, participants read a rule concerning how certain letters could be paired 
with certain digits, then generated letter-number pairs that conformed to or contradicted 
the rule. In Experiment 2, participants indicated whether a rule was true or false in 
relation to a specific letter-number pair. In both experiments, people who responded to 
TT instances showed a preference for inclusive interpretation. Approximately 50% to 
60% of the participants consistently favored an inclusive interpretation of these instances, 
about 20% consistently favored an exclusive interpretation, and the remainder gave 
inconsistent responses. However, Newstead and Griggs (1983) pointed out that focusing 
primarily on TT instances provides a less rigid criterion than reconstruction of the entire 
subjective truth tables used by participants. Using this approach, other studies (e.g., 
Braine & Rumain, 1981) have shown that exclusive disjunction is the primary 
interpretation, at least if the material involved is abstract or only weakly linked to 
real-world usage. 

Using more realistic scenarios, subsequent research suggested that interpretation 
of disjunctives is context-dependent. Newstead, Griggs, and Chrostowski (1984, Exp. 1) 
presented participants with brief passages, each of which included a disjunctive 
statement. Participants read three passages from each of seven different contexts (e.g., 
threat, choice, qualification). For each passage, they indicated whether each of four 
possible outcomes (including each of the four possible disjunct pairs: TT, TF, FT, FF) 
was consistent or inconsistent with the disjunctive statement. Although there was 
considerable variation between contexts, a majority of responses (averages of 65% and 
76% in two studies) were indicative of an exclusive interpretation. This was generally 
true of all contexts except qualification, in which the inclusive interpretation 
predominated. In qualification contexts (e.g., "The person I will vote for will have to be 
either intelligent or open-minded"), it is generally understood that the presence of both 
disjuncts most likely enhances a candidate's potential for gaining a vote rather than 
reducing it. 

In Experiment 2, participants read brief passages ending with a disjunctive 
statement, then read a sentence that either affirmed or denied the first disjunct. Their task 
was to indicate whether a conclusion that either affirmed or denied the second disjunct 
followed from the information given in the disjunctive statement and the sentence that 
followed it. Again, responses were typically indicative of an exclusive interpretation, 
except in a qualification context. 

Of the seven contexts studied, the abstract context is most directly relevant to 
interpretation of the exclusive disjunctive rule in the standard THOG problem. In fact, the 
disjuncts involved in one of the abstract scenarios included shape and color. Overall, in 
the abstract context, participants who evaluated consistency of outcomes (Newstead et al., 
1984, Exp. 1) favored an exclusive rather than inclusive interpretation (67% vs. 18% for 
the main experiment and 47% vs. 45% for the replication). Participants who evaluated 
conclusions based on a disjunctive statement followed by an affirmation of its first 
disjunct also favored an exclusive interpretation of the connective "or" (Newstead et al., 
1984, Exp. 2). However, the tendency to interpret a disjunctive statement as exclusive 
appeared somewhat weaker given an abstract context than in other contexts (except 
qualification). Additionally, the proportion of correct answers generally appeared lower in 
abstract contexts than in other scenarios. 

Introducing the THOG 

The THOG problem takes us into a "looking glass world" (Wason, 1978, p. 50), a 
world in which the pairs of features that define THOGness, if combined, create a design 
that is not a THOG. This is one of two contradictions that spur interest in the processes 
leading to solution of the THOG problem. The other contradiction, perhaps one that 
makes the actual solution seem untenable, is that the two THOGs in the problem share 
neither the same shape nor the same color. 

The problem is built around four combinations of two shapes and two colors. 
People are told that one of four designs is a THOG. Given a rule that defines THOGness, 
they are asked to indicate whether each of the other designs is or is not a THOG or 
whether there is insufficient information to make a decision. Typically, only about one 
third or less of adult participants who attempt the standard version of this task correctly 
classify the designs. This level of performance occurs despite explicit reinforcement of 
the dominant exclusive interpretation of "or" by inclusion of the phrase "but not both" in 
the problem statement that follows. 

In front of you are four designs: Black Diamond, White Diamond, Black Circle 
and White Circle. You are to assume that I have written down one of the colours 
(black or white) and one of the shapes (diamond or circle). Now read the 
following rule carefully: If, and only if, any of the designs includes either the 
colour I have written down, or the shape I have written down, but not both, then it 
is called a THOG. I will tell you that the Black Diamond is a THOG. Each of the 
designs can now be classified into one of the following categories: A) Definitely 
is a THOG, B) Insufficient information to decide, C) Definitely is not a THOG. 
(Wason & Brooks, 1979, p. 80). 



At this point, a review of logical steps leading to the correct solution of the THOG 
problem is appropriate. First, knowing that the Black Diamond is a THOG, and 
considering the rule "If, and only if, any of the designs includes either the color I have 
written down, or the shape I have written down, but not both, then it is called a THOG," 
participants who follow algorithmic steps toward solution would hypothesize the set 
of properties written down by the experimenter. There are two possible sets of properties: 
either black and circle or white and diamond. Second, the identity of each of the other 
three designs would be determined based on the first set of hypothesized properties (black 
and circle) and the rule. The White Diamond is not a THOG because it contains neither of 
these properties. The Black Circle is not a THOG because it contains both of them. The 
White Circle is a THOG because it contains one of the properties (circle) but not the other 
(black). Third, the identity of each of these designs would be determined based on the 
second set of hypothesized properties (white and diamond) and the rule. The results of 
this procedure are the same as those obtained with the black and circle set. Fourth, by 
simultaneously evaluating classifications given under each of the hypotheses, participants 
would reach the conclusion that the White Circle is a THOG, and the Black Circle and 
White Diamond are not THOGs. 
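
The four steps above can be traced in a short program. This is a minimal Python sketch under stated assumptions: designs and written-down properties are modeled as (color, shape) tuples, and the names `is_thog` and `solve` are hypothetical helpers introduced only for illustration. It generates the hypotheses consistent with the positive example, classifies each remaining design under every hypothesis, and then combines the classifications.

```python
from itertools import product

COLORS = ("black", "white")
SHAPES = ("diamond", "circle")
DESIGNS = list(product(COLORS, SHAPES))


def is_thog(design, written):
    """Exclusive disjunctive rule: the design has the written color or the
    written shape, but not both."""
    color, shape = design
    written_color, written_shape = written
    return (color == written_color) != (shape == written_shape)


def solve(positive_example):
    # Step 1: keep every written-down pair that makes the example a THOG.
    hypotheses = [w for w in DESIGNS if is_thog(positive_example, w)]
    verdicts = {}
    for design in DESIGNS:
        if design == positive_example:
            continue
        # Steps 2 and 3: classify the design under each hypothesis.
        outcomes = {is_thog(design, w) for w in hypotheses}
        # Step 4: combine the classifications across hypotheses.
        if outcomes == {True}:
            verdicts[design] = "THOG"
        elif outcomes == {False}:
            verdicts[design] = "not a THOG"
        else:
            verdicts[design] = "insufficient information"
    return hypotheses, verdicts


hypotheses, verdicts = solve(("black", "diamond"))
print(hypotheses)
print(verdicts)
```

With the Black Diamond as the positive example, the two surviving hypotheses are black and circle, and white and diamond. Both classify the White Circle as a THOG and the other two designs as not THOGs, so no design ends up in the "insufficient information" category.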

Early Investigations 
Any of the four steps toward solution provides a potential juncture for error. To test 
whether people understood the rule, Wason and Brooks (1979, Exp. 1) presented 
participants with four designs, each containing a different combination of two colors and 
two shapes. (They called the target designs CHUZ instead of THOG because the same 
participants later attempted to solve the standard THOG problem.) Participants then wrote 
down their own choice of one color and one shape, thereby constructing their own 
hypotheses. They read the rule and classified each design into one of three categories: A) 
Definitely is a CHUZ, B) Insufficient information to decide, C) Definitely is not a CHUZ. 
All participants in this Constructed CHUZ condition correctly identified all four designs 
based on their preceding choice of color and shape and application of the exclusive 
disjunctive rule. This solution rate demonstrated that adults do understand the rule, and 
can apply it to a single set of properties that they themselves have determined. 

A second possible difficulty is that people cannot correctly identify what the 
experimenter has written down. Wason and Brooks (1979, Exp. 2) presented participants 
with the CHUZ designs, the rule, and a checklist of four combinations of two colors and 
two shapes that the experimenter could have written down. Participants were told that one 
design was a CHUZ, then asked to indicate whether or not the experimenter could have 
written down each of the listed combinations. Sixty-four percent of the participants 
correctly identified which hypotheses could and could not be written down. 

In Experiment 3, Wason and Brooks (1979) showed participants the CHUZ 
designs, the rule, and the positive example. Instead of being given a checklist of possible 
hypotheses, these participants were asked to determine what was written down and 
provide a rationale for their answer. Sixty percent correctly identified the two possible 
hypotheses and backed their answer with appropriate reasons. Another 20% identified the 
hypotheses without adequately supporting their answers. Thus, in the two experiments, 
71% of the participants accurately interpreted the rule, using it to work backwards from 
the positive example to derive the properties that could be written down. Later 
experiments (Girotto & Legrenzi, 1989, Exp. 1; Girotto & Legrenzi, 1993, Exps. 1, 2, & 
3; Griggs, Platt, Newstead, & Jackson, 1998, Exps. 1, 2, & 3; Smyth & Clark, 1986, 
Exp. 3) replicated these findings regarding elicitation of hypotheses in a variety of 
problem contexts, with correct identification of both possible hypotheses ranging from 
50% to 95%. Clearly, at least a majority of people, typically more, are not stymied by 
either rule interpretation or hypothesis generation. 

However, the series of experiments conducted by Wason and Brooks (1979) also 
illustrated that neither understanding the rule (Exp. 1) nor the ability to identify 
hypotheses (Exp. 2) necessarily translated into an ability to solve the THOG problem. In 
Experiment 1, although all participants correctly identified the CHUZ after constructing 
their own hypothesis, only 28% were then able to solve the standard THOG problem. In 
Experiment 2, only 33% of the participants who spontaneously indicated both possible 
hypotheses in the CHUZ problem subsequently solved the problem. Participants who did 
not spontaneously generate the possible hypotheses were told which ones were correct 
and given an explanation of how they were derived. Prior to attempting to solve the 
CHUZ problem, all claimed to understand the problem at least insofar as the hypothesis 
generation stage, yet none provided the correct answer. Moreover, prior experience with 
this CHUZ problem, partitioned into the stages of hypothesis generation and hypothesis 
testing, did not facilitate subsequent performance on the THOG problem. 

The pattern of errors in these early investigations of the THOG problem 
foreshadowed those in subsequent research. After completing either the Constructed 
CHUZ or a standard CHUZ problem that differed from the THOG only in the assigned 
name and color of designs, participants attempted to solve the THOG problem (Wason & 
Brooks, 1979, Exp. 1). Forty-seven percent of the errors indicated either that the White Circle 
was not a THOG and that there was insufficient information to decide about the Black 
Circle and White Diamond (42%) or that the White Circle was not a THOG and that the 
Black Circle and White Diamond were THOGs (5%). Collectively, Wason and Brooks 
(1979) labeled these "intuitive" errors, because they "seem due to a plausible inference 
based on properties of the designs rather than on the hypotheses" (p. 84). Griggs and 
Newstead (1983) differentially labeled these errors Type A (with THOG responses 
for the Black Circle and White Diamond) and Type B (with indeterminate responses for 
the Black Circle and White Diamond). 

One possible explanation for these errors is drawn from work on attainment of 
disjunctive concepts (Bruner, Goodnow, & Austin, 1956). In the classic attribute learning 
paradigm, two attributes defined a concept, and each instance contained four attributes. 
Participants were shown a card and told whether or not it was illustrative of the concept. 
Then, they selected additional cards, one at a time. For each card selected, the 
experimenter indicated whether it was a positive or negative instance. After any selection, 
a participant had the option of hypothesizing what the defining attributes were. The task 
was complete when a participant hypothesized the correct attributes. For disjunctive 
concepts, participants frequently adopted an erroneous strategy which, though appropriate 
for attaining conjunctive concepts, led people astray when learning disjunctions. 
Developing their hypotheses based on positive instances, participants proposed that 
features shared by illustrative instances were those that defined the concept. This 
common element fallacy often fails for disjunctive concepts because two members of a 
category may have no feature in common. Bruner et al. also found that if the concept 
attainment task began with a negative rather than a positive example, participants used 
more efficient strategies to reach their initial hypotheses. Bruner et al. contended that the 
negative example encouraged participants to focus on attributes not included in the 
example, thus bypassing the common element fallacy. 

In the THOG problem, given the Black Diamond as a positive example, 
participants committing the common element fallacy hypothesize that the properties 
written down are black and diamond. If they then apply the rule, the Black Circle and 
White Diamond each appear to be THOGs, because each contains one of the 
hypothesized attributes. Because the White Circle contains neither of these properties, it 
does not appear to be a THOG. This response pattern corresponds to Type A errors. 
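
The Type A pattern can be reproduced by applying the exclusive disjunctive rule to the fallacious hypothesis. The sketch below is illustrative only; the tuple encoding of designs and the names `is_thog` and `fallacious_hypothesis` are assumptions, not materials from the experiments.

```python
def is_thog(design, written):
    """Exclusive disjunctive rule: written color or written shape, but not both."""
    color, shape = design
    written_color, written_shape = written
    return (color == written_color) != (shape == written_shape)


positive_example = ("black", "diamond")

# Common element fallacy (assumed model): the written-down properties are
# taken to be the properties of the positive example itself.
fallacious_hypothesis = positive_example

others = [("black", "circle"), ("white", "diamond"), ("white", "circle")]
type_a_pattern = {design: is_thog(design, fallacious_hypothesis) for design in others}

# Check whether the positive example is a THOG under the fallacious hypothesis.
example_still_thog = is_thog(positive_example, fallacious_hypothesis)

print(type_a_pattern)
print(example_still_thog)
```

The Black Circle and White Diamond come out as THOGs and the White Circle does not, which is the Type A pattern. The last check also shows why the hypothesis is untenable: under black and diamond, the Black Diamond has both written properties and so would not itself be a THOG.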

Another possible explanation for the error patterns in the THOG problem has been 
drawn from studies of conditional reasoning. In propositional logic, the affirmative rule 
"If Neil is studying, then his door is closed" is considered false only if Neil is studying 
and his door is open (the TF case). When Neil is not studying and his door is closed 
(the FT case), or when Neil is not studying and his door is open (the FF case), 
the rule is considered true, so both cases verify it. Evans (1972) investigated how people 
actually interpreted conditional rules, using stimuli differing in color and shape. He 
hypothesized that people considered the FT and FF cases irrelevant and would not 
designate them as either verifying or falsifying instances. Evans (1972) asked participants 
to construct examples that verified or falsified a variety of conditional rules, some of 
which involved negatives (e.g., "If not p, then q"). The major finding of this research 
overshadowed results concerning rule interpretation. When falsifying rules, people 
attended not to the logical structure but rather to the particular instances named in a rule. 
Their selections matched the named instances. For example, given the rule "If there is not 
a Yellow Diamond on the left, there is a Purple Circle on the right," only 30% of the 
participants correctly chose any color/shape other than the Yellow Diamond and any 
color/shape other than the Purple Circle (the TF case) to falsify the rule. Instead, people 
would initially select the Yellow Diamond on the left and the Purple Circle on the right 
(the FT case) as a falsifying instance. Few (3%) initial selections for falsifying the other 
three rule forms included the FT case, suggesting that its selection for rules with a 
negative antecedent and an affirmative consequent was influenced by matching bias. The 
idea that perceived relevance is linked to items that match those mentioned in the rule has 
also been offered as an explanation for performance on the four-card selection task 
(Evans, 1995). Response patterns on the THOG problem also show evidence of matching, 
but in a somewhat different way. 
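
The propositional-logic reading of the conditional described above can be made concrete. This is a minimal Python sketch with hypothetical names; it shows that only the TF case falsifies "if p then q," and that for a rule with a negated antecedent the matching choice (Yellow Diamond on the left, Purple Circle on the right) does not falsify the rule.

```python
def conditional(p: bool, q: bool) -> bool:
    """Material conditional: 'if p then q' is false only when p is true and q is false."""
    return (not p) or q


CASES = {"TT": (True, True), "TF": (True, False), "FT": (False, True), "FF": (False, False)}
verdicts = {label: conditional(p, q) for label, (p, q) in CASES.items()}
falsifying_cases = [label for label, holds in verdicts.items() if not holds]


def negated_antecedent_rule(yellow_diamond_left: bool, purple_circle_right: bool) -> bool:
    """'If there is not a Yellow Diamond on the left, there is a Purple Circle on the right.'"""
    return conditional(not yellow_diamond_left, purple_circle_right)


# Matching choice: both named items present (antecedent false, consequent true).
matching_choice_holds = negated_antecedent_rule(True, True)
# Logical choice: neither named item present (antecedent true, consequent false).
logical_choice_holds = negated_antecedent_rule(False, False)

print(falsifying_cases, matching_choice_holds, logical_choice_holds)
```

The rule still holds for the matching choice, whereas it fails when neither named design is present, which is the TF case for this rule form.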

Given the Black Diamond as a positive instance, the matching bias explanation 
suggests that people bypass the difficulty of testing multiple hypotheses simultaneously. 
Instead, to simplify the complexity of the task, people compare or match each of the 
designs against the positive example. Because the White Diamond and the Black Circle 
each share only one of the designated THOG properties, people are uncertain about the 
THOGness of these designs. Reflecting this uncertainty, people indicate there is 
insufficient information to make a decision. The White Circle does not match the Black 
Diamond on either attribute dimension. Possessing neither the color nor the shape of a 
Black Diamond, it is classified as "not a THOG." This response pattern corresponds to a 
Type B error. 

Griggs and Newstead (1983) designed a series of variations on the THOG 
problem to explore whether the common element fallacy or matching bias explanation 
best explained its difficulty. They reasoned that use of a negative example (indicating the 
Black Diamond was not a THOG) should facilitate identification of THOGs (the White 
Diamond and the Black Circle) in either case, although subsequent facilitation on a 
standard THOG problem would support a common element fallacy explanation. This 
rationale followed from the understanding that directing focus to a negative instance 
serves to initiate a logical pattern of reasoning, susceptible to transfer. In contrast, if 
people based their responses on matching, then they would reach the correct answers on 
the Not-THOG problem by matching to the example, albeit a negative one. 1 This 
nonlogical strategy would yield a correct answer only for the Not-THOG problem. Griggs 
and Newstead (1983, Exp. 1) found that performance was indeed better on the 
Not-THOG problem than on the standard version. However, participants who worked on 
the Not-THOG problem first were not more likely to solve the standard THOG problem 
than were those who worked on the standard THOG first. 2 The lack of transfer favored 
the matching bias explanation. 

1 Among participants who constructed verifying and falsifying cases of conditional rules, 
Evans (1972) found evidence of matching bias stemming from negative components of 
antecedents and consequents. Similarly, in selection task research, Evans and Lynch 
(1973) demonstrated that card choices tended to match the components named in the rule, 
regardless of the presence of negatives. 

2 In one of the two problems attempted by each participant, the shapes and shading of the 
designs were changed, as was the label for the target design (from THOG to CHUZ). 

An additional experiment introducing the Anti-THOG problem (Griggs & 
Newstead, 1983, Exp. 2, adopted from Wason, 1978) yielded a conflicting conclusion. In 
this problem, instead of being told that a design is a THOG if it contains one and only one 
of the two features written down, participants were told: "There is a particular color and a 
particular shape such that any of the four designs which has either both these features, or 
neither of them, is called a THOG." Note that if the Black Diamond is given as a positive 
example, this rule yields the same answer as the standard rule, i.e., the Black Diamond 
and White Circle are THOGs. Because of the conjunctive nature of the Anti-THOG rule, 
Griggs and Newstead reasoned that if people relied on a common element strategy, then 
performance would be better on the Anti-THOG problem than on the standard version. In 
contrast, matching bias based on the Black Diamond as a positive example would yield 
the same results for both versions. In a between-subjects design, the percentage of correct 
answers was higher for the Anti-THOG problem than for the THOG, supporting use of 
the common element fallacy. 
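The equivalence of the two rules for this example can be checked by brute-force enumeration. The following Python sketch is an illustration added for the reader, not a procedure from the studies reviewed; all function and variable names are hypothetical, and it assumes the four designs are the Black Diamond, Black Circle, White Diamond, and White Circle, with the Black Diamond as the positive example.

```python
from itertools import product

COLORS = ("black", "white")
SHAPES = ("diamond", "circle")
DESIGNS = list(product(COLORS, SHAPES))  # the four color-shape designs
EXAMPLE = ("black", "diamond")           # assumed positive example


def standard_rule(design, written):
    """THOG if the design has exactly one of the two written-down features."""
    return (design[0] == written[0]) != (design[1] == written[1])


def anti_thog_rule(design, written):
    """THOG if the design has both written-down features or neither of them."""
    return (design[0] == written[0]) == (design[1] == written[1])


def classify(rule, example=EXAMPLE):
    """Return the hypotheses consistent with the example and a verdict per design."""
    # A hypothesis is a color-shape pair that could have been written down,
    # i.e., one under which the positive example really is a THOG.
    hypotheses = [w for w in DESIGNS if rule(example, w)]
    verdicts = {}
    for design in DESIGNS:
        outcomes = {rule(design, w) for w in hypotheses}
        if outcomes == {True}:
            verdicts[design] = "THOG"
        elif outcomes == {False}:
            verdicts[design] = "not a THOG"
        else:
            verdicts[design] = "indeterminate"
    return hypotheses, verdicts


for name, rule in (("standard", standard_rule), ("Anti-THOG", anti_thog_rule)):
    hypotheses, verdicts = classify(rule)
    print(name, "hypotheses:", hypotheses)
    for design, verdict in verdicts.items():
        print("   ", design, "->", verdict)
```

Under these assumptions, both rules classify the White Circle as the only other THOG, but they reach that answer through different hypothesized pairs: black-circle and white-diamond for the standard rule, black-diamond and white-circle for the Anti-THOG rule.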

But there is more to the story. Griggs and Newstead (1983, Exp. 3) devised a third, 
more complex problem, the Denial THOG, to determine if matching bias plays a role in 
the difficulty of the THOG. Because the matching bias explanation suggests that people 
base their answers on the positive instance, not on the rule, a matching bias explanation 
predicts the same pattern of errors for a rule with negatives as for the standard rule, if the 
same positive example, the Black Diamond, is used. In the Denial THOG, the rule 
involved three negatives: "If, and only if, a design does not include the color that I have 
written down, or does not include the shape that I have written down, or does not include 
both the color and the shape that I have written down, then it is a THOG." The answer to 
this problem is that there is insufficient information to classify any of the designs, 
because any combination of shape and color other than black and diamond could be 
written down. The classification of the other designs differs depending on what is written. 
The error patterns for the Denial THOG resembled those for the standard THOG, with a 
majority of errors being intuitive, primarily Type B. These results were in accordance 
with matching bias predictions. 
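The indeterminate answer to the Denial THOG can be verified in the same way. The Python sketch below is an added illustration with hypothetical names. It assumes the third clause of the rule means "does not include the color and the shape together"; reading it as "includes neither" would give the same truth table, because the first two clauses already cover that case.

```python
from itertools import product

DESIGNS = list(product(("black", "white"), ("diamond", "circle")))
EXAMPLE = ("black", "diamond")  # designated as a THOG


def denial_rule(design, written):
    """Three-negative Denial THOG rule applied to one written-down pair."""
    has_color = design[0] == written[0]
    has_shape = design[1] == written[1]
    return (not has_color) or (not has_shape) or (not (has_color and has_shape))


# Pairs that could have been written down, given that the example is a THOG.
hypotheses = [w for w in DESIGNS if denial_rule(EXAMPLE, w)]

verdicts = {}
for design in DESIGNS:
    outcomes = {denial_rule(design, w) for w in hypotheses}
    if len(outcomes) == 2:
        verdicts[design] = "indeterminate"
    elif outcomes == {True}:
        verdicts[design] = "THOG"
    else:
        verdicts[design] = "not a THOG"

print("hypotheses:", hypotheses)
for design, verdict in verdicts.items():
    print(design, "->", verdict)
```

Every pair except black-diamond survives as a hypothesis, and each of the three remaining designs is a THOG under two hypotheses but not under the third, so none of them can be classified.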

Thus, no firm conclusions could be drawn concerning the common element 
fallacy versus matching bias explanations. However, Griggs and Newstead (1983) 
suggested, based on evidence from the selection task, that people relied on matching 
when no other solution path seemed viable (e.g., in more difficult problems). This 
hypothesis might explain why matching bias was less prominent in the Anti-THOG 
problem, which is logically less challenging than the THOG. Subsequent research 
stretched beyond this unresolved controversy to examine whether realism enhanced 
solution rates. 

Introducing Realism 

In other reasoning domains, the influence of realism on performance has been 
equivocal. Newstead, Griggs and Warner (1982) reported that belief bias compromises 
the conclusion that syllogistic reasoning is facilitated by realistic content (citing Wilkins, 
1928). In studies using conditional rules, the influence of concrete content has been 
inconsistent. Heightened performance originally attributed to realism (Johnson-Laird, 
Legrenzi, & Legrenzi, 1972) subsequently appeared to be more appropriately explained 
by memory cueing (Griggs & Cox, 1982). According to this explanation, realistic 
problems prime preexisting knowledge that is then applied to "solve" a problem in lieu of 
logical reasoning. Further, on a disjunctive reasoning task in which the meaning 
expressed by the conjunct of two premises was incongruent with real-life expectations, 
realism tended to impair rather than improve performance (Roberge, 1977, 1978, cited in 
Newstead et al., 1982). 

To study the effects of realism on the THOG problem, Newstead et al. (1982) 
investigated performance on four problems similar in structure to the THOG problem. 
The first (Newstead et al., 1982, Exp. 1), adapted from Stainton-Rogers (cited in Wason, 
1978) described the preferences of four women for clothing (jeans and shirts or dresses) 
and music (rock or classical), and designated one woman as having "style" (see 
Appendix A, Newstead et al., 1982, Style, for exact wording). Participants were asked to 
determine which other woman or women had style. The theme of the second problem 
(Newstead et al., 1982, Exp. 2) was eligibility for a third-year psychology course, with a 
prerequisite of one and only one previous course in cognitive psychology. Participants 
were given information about four students, each of whom had completed one of two 
first-year courses (social or cognitive psychology) and one of two second-year courses 
(social or cognitive psychology), then given an example of a student who qualified for a 
third-year course (see Appendix A, Newstead et al., 1982, Psychology, for a more 
complete description). Participants were asked to determine which other student or 
students, if any, qualified for the third-year course. Neither the Style nor the Psychology 
problem produced facilitation compared to abstract versions. 3 

To assess the extent to which prior expectations might boost solution rates, 
Newstead et al. (1982, Exps. 3 & 5) adapted a third problem related to food preferences 
from Cordell (1978; Appendix III, cited in Newstead et al., 1982). The items involved in 



3 Performance on the Style problem was compared to that on the standard THOG, 
whereas performance on the Psychology problem was compared to that on a problem 
involving letter combinations and an arbitrary "category P." 

the Meat and Gravy problem included two foods (meat or ice cream) and two sauces 
(gravy or chocolate sauce). Participants were given a rule, told that the experimenter 
would eat meat and gravy, then determined whether or not the experimenter would eat 
each of the other food-sauce combinations. The problem was designed so that the other 
edible combination (ice cream and chocolate sauce) coincided with preexisting beliefs 
(see Appendix A, Newstead et al., 1982, Meat and Gravy, for the exact wording of two 
versions). Both versions of the Meat and Gravy problem produced similar and significant 
facilitation compared to an abstract problem that contained letters and numbers instead of 
foods and sauces (43% correct vs. 0% correct, Exp. 3). Justifications written by 
participants supported the hypothesis that results were attributable to memory cueing 
rather than logical reasoning. However, in Experiment 5, performance on an incongruent 
version of this problem (with meat and chocolate sauce as the answer), was equivalent to 
performance on replications of the original Meat and Gravy version. Moreover, the 
proportion of correct answers to the original Meat and Gravy problem dropped to 20%, 
suggesting that adults' preexisting expectations had only a small influence on realistic 
versions of the THOG. 4 



4 In contrast, elementary school children (8 to 9 years of age) seemed highly susceptible 
to memory cueing (Newstead et al., 1982, Exp. 4). Given a congruent problem 
comparable to the Meat and Gravy scenario (using pictures of hamburger, mustard, 
pancakes and syrup), 75% of the children responded correctly. Because it was unlikely 
that these children had acquired the ability to combine information from two hypotheses 
(typically demonstrated at about age 11 or 12, according to Inhelder & Piaget, 1958), the 
high solution rate suggested that preexisting expectations influenced answer choices. 
Supporting this idea, when a correct response conflicted with prior experience (in the 
incongruent condition), no child solved the problem. 

Subsequent research probed whether the effect of realism was context specific. 
According to Wason (1978), an exclusive interpretation of "or" was particularly 
appropriate in an imperative context. Thus, Griggs and Newstead (1982) devised 
additional problems using imperatives (see Appendix A, Griggs & Newstead, 1982, Drug 
and Diet problems, for exact wording). In the Drug problem, participants read a scenario 
about administering drugs. Four drugs differed in content (calcium or potassium) and in 
mode of administration (oral or injection). Nurses were instructed to give patients one 
injection and one oral medication daily, containing one dose of calcium and one dose of 
potassium. One permissible combination was presented as a positive example, then 
participants determined whether or not each of the other combinations was appropriate. In 
the Diet problem, four ladies in a diet class were instructed to have meat either for lunch 
or dinner, but not both. When they took sandwiches on a picnic, they packed four boxes 
of sandwiches. Two boxes were for lunch and two were for dinner. For each meal, one 
box contained sandwiches with meat, the other sandwiches with cheese. Given a positive 
example of one combination of boxes that conformed to the diet plan, participants 
determined whether each of the other combinations fit the rule. 

All participants did both problems, with order of presentation rotated (Griggs & 
Newstead, 1982, Exp. 1). Compared to the abstract THOG problem (Newstead et al., 
1982), the Drug problem facilitated performance regardless of presentation order. The 
Diet problem, however, facilitated performance only when it was presented second. This 
inconsistency suggested that a factor other than an imperative context was involved in 
facilitation. Griggs and Newstead (1982) posited that this factor related to problem 
structure. In the Drug problem, both divisions of the structural tree (calcium and 
potassium) were clearly specified and linked to the second property (oral and intravenous) 
that in turn was linked to two specific drugs. This linkage was not present in either the 
Diet problem (specifying meat and one meal but not indicating the content of the other) or 
in the standard THOG problem (citing properties written down but not mentioning those 
properties that were not written down). To test the hypothesis that facilitation related to 
explicitly providing information needed to construct a binary symmetrical structural tree, 
Griggs and Newstead (1982, Exp. 3) modified the Diet problem and created a structured 
version of the abstract THOG (see Appendix A, Griggs & Newstead, 1982, for exact 
wording of the Structured Diet problem). 

In the Structured Abstract THOG, participants determined which of four objects, 
each denoted by a nonsense syllable, conformed to a rule about correct combinations. 
Two objects were squares (CHON and THIG) and two were circles (GREF and WULP). 
One square and one circle were black and the others were white. Participants were not 
told which name corresponded to which color. Given a positive example of a permissible 
pair and a rule that a correct combination included one object of each color and one 
object of each shape, participants indicated whether each of the remaining objects 
conformed to the rule. In the Structured Diet problem, the phrasing of the rule was 
changed to indicate that the ladies should have meat for one and only one meal and 
cheese for one and only one meal. Thus, both branches of the structural tree were labeled. 
In a between-subjects design, participants in the experimental groups completed one of 
these problems. To serve as a baseline, participants in a control group completed the 
standard THOG problem. 

Participants who worked on the structured problems performed extremely well. 
Solution rates were 90% for the Structured Abstract THOG, 85% for the Structured Diet 
problem, and 10% for the standard THOG problem. Would this impressive facilitation 
lead to transfer from a structured version to the standard THOG? Griggs and Newstead 
(1982) did not conduct a transfer test between the Structured Abstract THOG and 
standard THOG problems. Despite their shared underlying structure and level of 
abstraction, these two problems appear to require different reasoning processes. For the 
Structured Abstract THOG, the answer can be derived primarily by a process of 
elimination. Given CHON-GREF as a positive example, the CHON-WULP combination 
is disallowed because WULP is a different color than GREF and therefore must be the 
same color as CHON. Similarly, the THIG-GREF combination is disallowed because 
THIG is a different color than CHON and therefore must be the same color as GREF. 
Other than the positive example, the only remaining square-circle possibility is 
THIG-WULP, because THIG must be the opposite color of CHON (given both are 
squares), and WULP must be the opposite color of GREF (given both are circles). Thus, 
THIG-WULP is a permissible combination. This line of reasoning seems markedly less 
complex than the simultaneous evaluation of two hypotheses required in the standard 
THOG, reducing the likelihood of transfer. 
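The elimination argument for the Structured Abstract THOG can be made concrete with a short enumeration. The Python sketch below is an added illustration with hypothetical function names. It assumes, as described above, that one square and one circle are black, that the others are white, and that CHON-GREF is the positive example.

```python
from itertools import product

SHAPE = {"CHON": "square", "THIG": "square", "GREF": "circle", "WULP": "circle"}
EXAMPLE = ("CHON", "GREF")  # the permissible pair given to participants


def permissible(pair, coloring):
    """A pair conforms if it has one object of each shape and one of each color."""
    a, b = pair
    return SHAPE[a] != SHAPE[b] and coloring[a] != coloring[b]


def colorings():
    """Yield every coloring with exactly one black square and one black circle."""
    for black_square, black_circle in product(("CHON", "THIG"), ("GREF", "WULP")):
        yield {
            name: "black" if name in (black_square, black_circle) else "white"
            for name in SHAPE
        }


# Colorings that remain possible once CHON-GREF is known to be permissible.
consistent = [c for c in colorings() if permissible(EXAMPLE, c)]


def verdict(pair):
    """Classify a pair across all colorings consistent with the example."""
    outcomes = {permissible(pair, c) for c in consistent}
    if outcomes == {True}:
        return "permissible"
    if outcomes == {False}:
        return "not permissible"
    return "indeterminate"


for pair in (("CHON", "WULP"), ("THIG", "GREF"), ("THIG", "WULP")):
    print(pair, "->", verdict(pair))
```

Two colorings remain consistent with the example, and both give the same verdict for every pair, so a solver never has to hold conflicting outcomes in mind. This is one way to see why the structured version may demand less than the standard THOG.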

Griggs and Newstead (1982, Exp. 4) did examine transfer between a rephrased 
version of the Drug problem (see Appendix A, Griggs & Newstead, 1982, Rephrased 
Drug, for more details) and the standard THOG. Order of presentation was balanced 
between subjects. Unlike the transfer between the Drug and Diet problems, prior 
experience with a structured problem clearly did not transfer to the standard THOG. 
Although about half of the participants correctly responded to the Rephrased Drug 
problem in each presentation order, the percentage solving the standard THOG ranged 
from 0% to 6%. 

Apparently, the difference in specificity of the branches of the structural tree was 
too great, hampering participants' ability to link the two problems. The elusiveness of 
transfer echoed earlier findings indicating that transfer from one problem-solving task to 
another occurred only when participants clearly recognized structural similarities, an 
effect mediated by problem complexity (Luger & Bauer, 1978; Reed, Ernst & Banerji, 
1974). The Diet problem statement alluded to the binary nature of its structure, although 
the non-meat side of the tree was neither positively defined nor named. Yet even though 
the second branch was not explicit, its presence, perhaps in the description of the 
sandwiches, sufficed to permit transfer from the more explicitly structured Drug problem. 
In contrast, the THOG problem statement made no reference to properties not written 
down, minimizing the probability that participants would recognize the two branches. 
This failure may have led participants to create an inappropriate internal representation 
that blocked subsequent achievement of the correct solution. 

Rather than constructing a complex scenario to imbue the THOG problem with 
realism, Smyth and Clark (1986) selected a real-life example of exclusive disjunction, the 
half-sister relationship, and embodied it in the THOG format. This arrangement permitted 
them to explore the effects of increasing cognitive complexity by comparing transfer to 
the standard THOG problem for each of a series of Half-Sister problems. In Experiment 
1, Smyth and Clark demonstrated that people understood the half-sister relationship (see 
Appendix A, Smyth & Clark, 1986, Half-Sister, for exact problem wording), but that this 
knowledge did not transfer into improved performance on the THOG problem. However, 
Half-Sister wording did not parallel that of the THOG, nor did the Half-Sister problem 
explicitly contain an exclusive disjunctive rule. When Smyth and Clark (Exp. 2) 
rephrased the Half-Sister problem to approximate the phrasing of the THOG problem 
statement, performance relative to the original Half-Sister problem dropped (from 93% 
correct answers to 37% correct answers), even though the relationship was cued with the 
words "my mother" and "my father" (see Appendix A, Smyth & Clark, 1986, Cued 
Half-Sister, for exact wording). The decline was attributable to a failure to correctly 
identify women who were not half-sisters. Despite this difficulty, performance on the 
Cued Half-Sister was better than performance on the standard THOG. There was no 
evidence of transfer. 

In a second step toward determining the effects of heightened task complexity, 
Smyth and Clark (1986, Exp. 3) developed a problem that did not explicitly state it was 
necessary to assume that either one of the mothers or one of the fathers was written down. 
The problem statement in this Uncued Half-Sister problem also did not provide cues to 
possible combinations of parents that could have been written down (see Appendix A, 
Smyth & Clark, 1986, Uncued Half-Sister, for exact wording). Half of the participants 
who attempted to classify the four women responded to a question "Who could my 
parents be?" prior to classification (Structured Uncued Half-Sister problem). Performance 
on both versions of the Uncued Half-Sister (10% correct) was no better than performance 
on the standard THOG (8% correct). Although about two thirds of the participants 
correctly identified both hypothesized sets of parents if asked to do so, only 15% of this 
group then correctly classified the women. Thus, the difficulty appeared to stem primarily 
from the need to simultaneously evaluate more than one alternative, rather than from an 
inability to generate appropriate pairs, as was true for the CHUZ problem (Wason & 
Brooks, 1979). 

Smyth and Clark (1986) also investigated whether errors on the more complex 
versions of the Half-Sister problem stemmed from inappropriate strategies similar to 
those leading to errors on the standard THOG. To do so, they converted the Not-THOG 
problem (Griggs & Newstead, 1983, Exp. 1) to the Not-Half-Sister problem (Smyth & 
Clark, 1986, Exp. 4), by providing a negative example. Half of the participants who 
attempted to classify the four women on the Not-Half-Sister problem responded to a 
question "Who could my parents be?" prior to classification (Structured Not-Half-Sister 
problem). Forty-eight percent of the participants answered the Not-Half-Sister problems 
correctly on the first presentation of these problems. This solution rate was above the 
solution rates for the standard THOG problem (8%), the Not-THOG problem (25%, 
Griggs & Newstead, 1983), and the Cued and Uncued Half-Sister problems (36% and 
10% respectively). Performance was similar on the Structured and Unstructured 
Not-Half-Sister problems. Reaching a correct solution on these problems failed to transfer 
to subsequent performance on the standard THOG problem. Because participants who 
provided correct answers to the Not-Half-Sister problems were just as likely to make 
nonintuitive as intuitive errors on the standard THOG problem, Smyth and Clark rejected 
the matching bias explanation for the observed facilitation on the Not-Half-Sister 
problems. They suggested instead that the high solution rate reflected a tendency to 
consider the parents of the example conjunctively, simplifying subsequent operations. 

The Era of Separation 

In THOG-type problems, intuitive errors are typically more frequent than other 
types of errors. In the studies reviewed to this point, intuitive errors accounted for 33% 
(Wason & Brooks, 1979) to 60% (Griggs & Newstead, 1983) of all responses, generally 
exceeding the percentage of correct answers. Both types of intuitive error stem from a 
tendency to inappropriately base decisions (either determination of hypotheses or 
classification of designs) on the properties of the positive example. Girotto and Legrenzi 
(1989) proposed that a key to reducing this misleading strategy was to clearly separate the 
data with which people are provided (the designated THOG) from the hypotheses they are 
asked to generate (the properties written down). Girotto and Legrenzi (1989) suggested 
that this separation could be achieved by creating a scenario in which there was a 
temporal separation between the data and hypotheses. In the Two-Level Spy problem (see 
Appendix A, Girotto & Legrenzi, 1989, Two-Level Spy, for exact wording), they devised 
a thematic problem that required generating hypotheses based on an exclusive disjunctive 
rule. To solve the problem, participants were required to alter the properties of the 
positive example. The alteration was designed to "defocus" attention from this example, 
and thereby encourage application of the rule to the hypothesized combinations. 

The story involved four spies, each with two features (type of job and type of visa) 
on their passports (Girotto & Legrenzi, 1989, Exp. 2). To return home in an emergency, 
the spies altered one and only one of the features on their passports. Participants were 
provided with information about the original versions of the passports for each spy and 
told that one of the spies arrived home safely. Participants then determined which of the 
other three spies, if any, also returned home without difficulty. Seventy-five percent of 
the participants correctly solved this problem, compared to only 15% of those who were 
presented with a similar thematic problem (see Appendix A, Girotto & Legrenzi, 1989, 
One-Level Spy, for exact wording) that did not demand modification of the features of the 
positive example, and 29% of those who worked on the standard THOG problem. In the 
latter problems, 79% of the errors were intuitive, primarily Type B. Girotto and Legrenzi 
(1989) hypothesized that the relatively poor performance on the One-Level Spy problem 
resulted from its failure to clearly separate the data (the properties of the original passport 
of the spy who returned home safely) and hypotheses (the properties of the altered 
passports) levels. If this was indeed the case, then facilitation could occur using more 
abstract material if data and hypotheses were adequately separated. 

In the Pub problem, Girotto and Legrenzi (1989, Exp. 3) embedded the colors and 
shapes from the standard THOG into a story about a card game in which the prize was a 
free dinner (see Appendix A, Girotto & Legrenzi, 1989, Pub, for exact wording). One of 
five men dealt himself and each of four friends a card. Each card contained one of the 
four color-shape combinations from the THOG problem. The dealer offered to buy dinner 
for whomever had a card that included either the color or shape of the design on his own 
card, but not both. The person who held the Black Diamond was designated as someone 
for whom the dealer would buy dinner. Participants then decided what card the dealer had 
and whether he would buy dinner for anyone else. Eighty-nine percent of the participants 
solved this problem, demonstrating that alteration of properties was not necessary for 
facilitation. The problem structure provided sufficient incentive to inhibit prolonged 
focus on the positive example (the card held by the person who was owed a dinner) and to 
encourage concentration on the hypotheses (the cards that could be held by the dealer). 
Although the presence of the story reduced the level of abstraction compared to the 
standard THOG, results supported the idea that facilitation was linked to the structure of a 
story, in this case the clarity with which it separated data and hypotheses, rather than the 
introduction of realism. 
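The two-step structure of the Pub problem (first infer the dealer's card, then decide who else is owed a dinner) can be sketched as follows. This Python fragment is an added illustration, not part of the original materials. The names are hypothetical, and it assumes only what the description above states: the four friends hold the four color-shape cards, and the holder of the Black Diamond is owed a dinner.

```python
from itertools import product

CARDS = list(product(("black", "white"), ("diamond", "circle")))
WINNER = ("black", "diamond")  # data: this friend is owed a dinner


def owes_dinner(card, dealer_card):
    """Dealer pays if the card shares the color or the shape of his card, not both."""
    return (card[0] == dealer_card[0]) != (card[1] == dealer_card[1])


# Step 1 (hypotheses): which cards could the dealer be holding?
dealer_hypotheses = [c for c in CARDS if owes_dinner(WINNER, c)]

# Step 2 (classification): which other friends win under every hypothesis?
others = [c for c in CARDS if c != WINNER]
also_owed = [c for c in others if all(owes_dinner(c, d) for d in dealer_hypotheses)]

print("dealer could hold:", dealer_hypotheses)
print("also owed a dinner:", also_owed)
```

Keeping the friends' cards (the data) in one list and the dealer's possible cards (the hypotheses) in another mirrors the separation that Girotto and Legrenzi argued the story makes salient.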

Working with abstract material, O'Brien et al. (1990) compared the effects of 
separation of the positive example from the properties that were written down (Trump 
THOG), labeling of properties not written down (Blackboard and Blackboard Control 
THOG), and instruction phrasing (One-Other THOG) on the proportion of correct 
answers. 5 Their work was designed to evaluate the explanation offered for the facilitatory 
effect of the Structured Abstract THOG (Griggs & Newstead, 1982); namely, that 
facilitation stemmed from providing labels for both sides of the structural tree. In the 
Blackboard THOG condition (O'Brien et al., 1990, Exp. 3), participants were told that 
one of the colors and one of the shapes were written on the left-hand side of a blackboard, 
and that the other color and other shape were written on the right-hand side of a 
blackboard. In the Blackboard Control condition, only the items on the left-hand side of a 
blackboard were mentioned. The rule was identical in both cases, stating that if and only 
if any of the designs included either the color or shape written on the left-hand side of 
the blackboard, but not both, then the design was a THOG. The Blackboard THOG 
problem, explicitly referring to the binary structure, was more frequently solved than the 



5 O'Brien et al. (1990) used a triangular shape instead of a diamond. However, to avoid 
disrupting comprehension of comparisons between the O'Brien et al. investigations and 
other research, including the experiments conducted for this dissertation, the triangular 
shape is referred to as a diamond throughout this manuscript. 

standard THOG problem (40% versus 5% respectively, in a cross-experiment 
comparison), but the Blackboard Control problem (15% correct) did not significantly 
facilitate performance. 6 These findings provided further support for the value of labeling 
both branches of the structural tree. However, the Trump THOG (O'Brien et al., 1990, 
Exp. 4) demonstrated that labeling both sides of the structural tree was not necessary for 
facilitation. 

The Trump THOG problem was designed to separate the properties of the THOG 
example from the properties written down, as did the realistic Pub and Spy problems 
(Girotto & Legrenzi, 1989). Instead of referring to properties written down, in the Trump 
THOG problem, one of the colors was labeled TRUMP, and one of the shapes was 
labeled FAFNER. A THOG was defined as a design that contained either the color 
TRUMP or the shape FAFNER, but not both. This version of the problem also facilitated 
performance (45% correct), attesting to the effect of separation with abstract material, at a 
level comparable to the effect of labeling both sides of the structural tree. 

In the One-Other THOG problem, the problem statement was identical to that of 
the standard THOG. However, instead of being asked to classify each of the designs, 
participants were explicitly told that only one design other than the Black Diamond was a 
THOG. Their task was to correctly identify the other THOG. Sixty percent of the 
participants did so. However, written justifications revealed that 67% of those who gave 
the correct answer reached their conclusion using some type of exclusion strategy or by 



6 O'Brien et al. asked participants to identify the Black Diamond, as well as the other 
three designs, in all problems except for the task with the one-other instruction. Across 
the Trump, Blackboard, and Blackboard Control THOG conditions, only 5% of the 
participants did not correctly label the Black Diamond as a THOG. 

focusing on uniqueness, without engaging in simultaneous testing of multiple hypotheses. 
This suggested that the high solution rate for the Structured Abstract THOG may have 
been partially attributable to participants who attained the correct answer by a process that 
did not involve testing alternate hypotheses. Subsequent work with separation using 
abstract material (Girotto & Legrenzi, 1993) was subject to a similar effect from 
instruction wording. 

Facilitation via separation suggests that a major difficulty in the THOG problem 
stems from inappropriate focus on the positive example, leading to confusion between its 
properties and the hypothesized properties to which the rule must be applied. In its 
original form, confusion theory proposed that people assumed that the properties of the 
positive example were those that were written down. However, in light of evidence that 
many people who correctly identified the hypothesized combinations (white and diamond 
or black and circle) failed to solve the problem, confusion theory has been extended to 
incorporate any confusion between the hypothesized properties and the properties of the 
positive example (Newstead, Girotto & Legrenzi, 1995). This revision allows for the 
possibility that people correctly identify the hypotheses but then consider the properties 
written down as (incorrectly) exemplifying THOGs rather than as properties that define 
THOGness through the application of the rule. This seems akin to proposing that after 
generating the hypotheses, people bypass applying the rule and select as THOGs those 
designs that match the properties written down. Although it is difficult to determine 
whether incorrectly identifying the White Diamond and Black Circle as THOGs stems 
from inappropriate reasoning (considering the written-down properties as examples) or 
perceptual bias (matching the designs to the properties written down), successful 
separation overcomes these difficulties by drawing attention away from the positive 
example. However, Newstead and Griggs (1992) demonstrated that separation provides 
only a partial explanation. 

In their initial replication of the Pub problem, Newstead and Griggs (1992) 
specifically requested that participants respond to the question "Which card do you think 
Charles could have?" prior to determining whether or not each of the other friends would 
receive a free dinner (or if there was insufficient information to decide). Compared to the 
standard THOG problem with an expanded rule, 7 the Pub problem (Newstead & Griggs, 
1992, Exp. 1) significantly facilitated performance (0% versus 41% correct answers 
respectively). However, a direct comparison between performance on this Pub problem 
with a version that omitted the question about Charles' card revealed that performance 
was enhanced by inclusion of the question (53% correct with the question versus 7% 
correct without the question). Thus, the facilitatory effect of the Pub problem was based 
not only on thematic separation but also required explicitly asking a question about the 
hypotheses. 

Newstead and Griggs (1992, Exp. 3) probed the influence of adding a similar 
Question 1 to the standard THOG problem. Prior studies that focused on generating 
hypotheses about the properties written down on an abstract problem (O'Brien et al., 
1990; Smyth & Clark, 1986; Wason & Brooks, 1979) suggested that this should have 
little effect on attaining the correct solution, and indeed it did not. The percentage of 
correct answers was similar whether Question 1 was present (23%) or absent (13%). Also 



7 The rule stated: "If a design includes either the color or the shape I have written down, 
then it is not a THOG. If a figure has neither the color nor the shape, it is not a THOG." 

consistent with earlier work, correctly generating the two possible hypotheses did not 
necessarily lead to correct classification of the designs. Forty percent of the participants 
indicated that either the Black Circle or the White Diamond could have been written 
down, but of these, only 25% then proceeded to solve the problem. Thus, given 
separation, the presence of Question 1, forcing participants to generate hypotheses, was 
required for facilitation; however, without separation, hypotheses generation did not 
inevitably lead to a correct answer. These results required a further modification to 
confusion theory, because simply eliminating the confusion between the positive example 
and the hypothesized properties was not sufficient to produce facilitation, unless people 
initially generated hypotheses. 

Newstead and Griggs (1992) suggested that versions of the THOG that produced 
facilitation (e.g., the Drug, Restructured Diet, Spy, Pub, and Trump problems), either 
directly or indirectly promoted both hypotheses generation and separation. For example, 
in the Drug and Restructured Diet problems (Griggs & Newstead, 1982), clarifying the 
binary nature of the problem may have facilitated hypotheses generation. In the Spy 
problem (Girotto & Legrenzi, 1989), hypotheses generation may have been encouraged 
by the requirement to modify the passports. In the Pub problem (Girotto & Legrenzi, 
1989), hypotheses generation may have stemmed from asking about Charles' card. In the 
Trump problem (O'Brien et al., 1990), labeling the properties may have induced 
hypotheses generation. As Newstead and Griggs (1992) acknowledged, post hoc 
explanation does not provide empirical justification. However, at the least, their 
explanations illustrated that prior instances of facilitation did not preclude the possibility 






that both separation and hypotheses generation jointly contributed to enhanced solution 
rates. 

Newstead and Griggs (1992) noted that even if confusion theory was modified to 
encompass hypotheses generation, it would fail to explain why participants frequently 
indicate that they cannot determine whether or not the White Diamond and Black Circle 
are THOGs (Type B error). According to the theory, confusion would result in the 
classification of these designs as "definitely not THOGs." This assertion appears to 
ignore another source of confusion. If people generate two hypotheses, they may be 
uncertain what to do next. From reading the problem statement, they may realize that only 
one of these two hypotheses is actually written down. Failing to grasp the possibility that 
both hypotheses may lead to the same conclusion, people may resolve the dilemma by 
indicating that the status of the White Diamond and Black Circle is indeterminate. In 
decision-making, Shafir and Tversky (1992) have demonstrated that when faced with 
uncertainty, people tend to postpone a decision. The indeterminate response in the THOG 
problem may be analogous to such postponement. 

If confusion can be reduced by separation, then facilitation should occur with or 
without a realistic context. The Trump THOG (O'Brien et al., 1990) demonstrated such 
an effect, although its 45% solution rate did not reach the 89% level achieved in the Pub 
problem (Girotto & Legrenzi, 1989). However, in addition to a realistic setting, the Pub 
problem combined the two hypothesized properties under a single label (e.g., Charles' 
card), whereas the Trump THOG assigned a label to one color and one shape but did not 
label the hypothesized combination. Perhaps a single label is a prerequisite to facilitation, 
even if data and hypotheses are separated. 






Girotto and Legrenzi (1993) created an abstract problem, SARS, in which the 
hypothesized properties of the designs were combined under the label SARS. Participants 
were presented with the following problem statement. 

In front of you are four designs: Black Diamond, White Diamond, Black Circle, 
and White Circle (see Figure). I have defined one of these designs as a SARS. 
You do not know which design this is. But you do know that a design is a THOG 
if it has either the color or the shape of the SARS, but not both. Knowing for sure 
that the Black Diamond is a THOG, you have to indicate which one or which 
ones, among the remaining designs, could be the SARS. (Girotto & Legrenzi, 
1993, p. 705) 

Given the Black Diamond as a positive example of a THOG, participants first 
indicated which of the remaining designs could be the SARS (Question 1). Next, they 
responded to the instruction: "Could you also indicate whether, in addition to the Black 
Diamond, there are other THOGs?" Note that the task requirements stemming from this 
"other" THOG instruction differ from those of the standard THOG instruction 
(classifying each design as either a THOG, not a THOG, or indeterminate). 
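
The logic that the SARS problem asks of participants can be made concrete with a minimal sketch. The code below is illustrative only: the function names (is_thog, possible_sars, other_thogs) and the tuple representation of designs are assumptions introduced here, not part of Girotto and Legrenzi's materials. It encodes the exclusive disjunction, answers Question 1, and then answers the "other" THOG instruction by checking each remaining design against every possible SARS.

```python
from itertools import product

# Illustrative representation (an assumption): a design is a (color, shape) pair.
COLORS = ("Black", "White")
SHAPES = ("Diamond", "Circle")
DESIGNS = list(product(COLORS, SHAPES))


def is_thog(design, sars):
    """Exclusive disjunction: same color or same shape as the SARS, but not both."""
    same_color = design[0] == sars[0]
    same_shape = design[1] == sars[1]
    return same_color != same_shape


def possible_sars(positive_example, designs=DESIGNS):
    """Question 1: which remaining designs could be the SARS?"""
    return [
        d for d in designs
        if d != positive_example and is_thog(positive_example, d)
    ]


def other_thogs(positive_example, designs=DESIGNS):
    """'Other' instruction: designs that are THOGs under every possible SARS."""
    candidates = possible_sars(positive_example, designs)
    return [
        d for d in designs
        if d != positive_example and all(is_thog(d, s) for s in candidates)
    ]


print(possible_sars(("Black", "Diamond")))
print(other_thogs(("Black", "Diamond")))
```

Under these assumptions, the two candidates for the SARS are the Black Circle and the White Diamond, and the White Circle is the only other design that is a THOG under both.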

Using a between-subjects design, Girotto and Legrenzi (1993, Exp. 1) compared 
performance on the SARS problem to that on the THOG problem with the "other" 
instruction and on the Hypotheses-THOG. In the Hypotheses-THOG, Girotto and 
Legrenzi (1993) introduced Question 1 into the standard THOG problem, asking 
participants to identify the properties written down. This Hypotheses-THOG also 
included the "other" THOG instruction, in lieu of the standard three-choice inquiry. The 
solution rate for the SARS problem (70%) exceeded that for the Hypotheses-THOG 
(40%) and the standard THOG with the "other" instruction (25%). Consistent with 
previous studies involving hypotheses generation, at least half of the participants in the 
SARS (61%) and Hypotheses-THOG (50%) conditions correctly generated two 






hypotheses. Among those who did so, 76% in the SARS condition and 60% in the 
Hypotheses-THOG condition correctly identified the designs. Participants who were 
unable to generate at least one hypothesis were also unable to correctly identify the 
designs. 

An examination of the error patterns for the three problems suggested that only 
the SARS problem eliminated considerable confusion leading to intuitive errors. Only 
14% of the incorrect answers in the SARS condition were classified as intuitive errors, 
compared to 58% for the Hypotheses-THOG and 56% for the standard THOG with the 
"other" instruction. This error pattern, together with relative performance on the SARS 
versus the Hypotheses-THOG, supported the effectiveness of separation. On the other 
hand, it was possible that after naming two SARS, participants then selected the only 
remaining design, the White Circle, as the other THOG, without performing hypotheses 
tests. 

Attempting to eliminate this alternative explanation, Girotto and Legrenzi (1993) 
conducted two subsequent experiments with the SARS, each with six designs. The two 
additional shapes were a Gray Triangle and a Gray Rectangle. In Experiment 2, the 
problem statement was identical to that of the SARS problem except that it specified six 
shapes rather than four. In Experiment 3, the problem statement indicated that the 
experimenter had chosen one shape and color, designating the combination a SARS (see 
Appendix A, Girotto & Legrenzi, 1993, SARS 6, Id Color and Shape, for exact wording). 
In this version, the phrasing of Question 1 was also altered, requesting participants to find 
the possible SARS, rather than indicating which of the remaining designs could be 
SARS. However, despite evidence of facilitation (67% and 63% solved the problems in 






Experiments 2 and 3 respectively), it could be argued that the gray color and triangular 
and rectangular shapes were simply considered irrelevant because they did not contain 
any of the properties of either the SARS or the THOG, leaving only the White Circle as 
the other possible THOG. 
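
The concern about the six-design version can be illustrated with a short sketch. It assumes the Experiment 2 reading, in which the SARS is one of the six presented designs; the variable names and the tuple representation are hypothetical. Under that assumption, the gray designs are never candidates for the SARS and are never THOGs, so they add nothing to the logical analysis.

```python
# Illustrative representation (an assumption): a design is a (color, shape) pair.
SIX_DESIGNS = [
    ("Black", "Diamond"), ("White", "Diamond"),
    ("Black", "Circle"), ("White", "Circle"),
    ("Gray", "Triangle"), ("Gray", "Rectangle"),
]
POSITIVE = ("Black", "Diamond")  # the design known to be a THOG


def is_thog(design, sars):
    """Same color or same shape as the SARS, but not both."""
    return (design[0] == sars[0]) != (design[1] == sars[1])


# Designs that could be the SARS, given that the Black Diamond is a THOG.
candidates = [d for d in SIX_DESIGNS if is_thog(POSITIVE, d)]

# For each remaining design, its THOG status under each candidate SARS.
verdicts = {
    d: [is_thog(d, s) for s in candidates]
    for d in SIX_DESIGNS
    if d != POSITIVE
}

for design, status in verdicts.items():
    print(design, status)
```

Only the White Circle is a THOG under both candidates; both gray designs come out as non-THOGs under each candidate, which is consistent with the argument that participants could dismiss them without testing hypotheses.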

Girotto and Legrenzi (1993) did not address the issue of whether Question 1 was 
required to obtain facilitation, nor did they examine the effect of the "other" THOG 
instruction on facilitation, compared to the standard three-choice instruction. In the task 
used by O'Brien et al. (1990) with a one-other THOG instruction, 67% of those who 
correctly identified the White Circle as a THOG used an exclusion strategy or focused on 
uniqueness in lieu of logical reasoning. This sensitivity to information that focuses 
attention on a single design may also explain performance on the SARS problem. 

The possibility of alternative explanations motivated Griggs, Platt, Newstead and 
Jackson (1998) to conduct a series of experiments in which they systematically 
manipulated the requirement for hypotheses generation (Question 1 present or absent), 
and the wording of the instruction. In the first series of experiments (Exp. 1 and four 
replications), Griggs et al. (1998) compared performance on the SARS problem with and 
without a request to generate hypotheses. Both versions included the standard 
three-choice instruction, rather than the "other" THOG instruction. Among a variety of 
participant populations (ranging from high school students to undergraduates to graduate 
students), Griggs et al. failed to replicate the Girotto and Legrenzi results. Solution rates 
ranged from 0% to 12% in the five studies, with no differences in performance related to 
the presence or absence of Question 1. Only the "other" THOG instruction (Griggs et al., 




1998, Exp. 3) produced facilitation, with approximately half of the participants 
identifying the White Circle as a THOG. 

Using a 2 x 2 factorial design, Griggs et al. (1998, Exp. 4) also examined 
performance on the SARS and standard THOG problems (both without Question 1) with 
either the one-other or standard three-choice instruction. On both problems, the one-other 
instruction produced facilitation compared to baseline solution rates for the three-choice 
instruction (11% on the standard THOG problem and 6% for the SARS problem). There 
was no significant difference between the solution rates for the SARS (72%) and standard 
THOG (53%) problems with the one-other instruction. This pattern of results testified 
that the presence of Question 1 was not necessary for facilitation if the instruction stated 
(O'Brien et al., 1990) or suggested (Girotto & Legrenzi, 1993) that there was one other 
THOG. 

As O'Brien et al. (1990) mentioned, the one-other THOG instruction did not 
consistently enhance logical reasoning. Instead, for many participants, it served to narrow 
down the possibilities, rather than separate the properties of the example and the 
hypotheses. It achieved its effect in abstract problems that did not involve a single label 
for the pairs of hypothesized properties and did not require hypotheses generation. 
Perhaps telling participants that there is only one other THOG discourages intuitive errors 
by implying that both the Black Circle and White Diamond cannot be THOGs. From an 
attentional perspective, when people cannot justify choosing either the White Diamond as 
opposed to the Black Circle or vice versa, they may restrict their search to a design that is 
unique. The White Circle is the only available candidate. 









This attentional cueing explanation (Griggs et al., 1998) suggests that participants 
may "identify the correct designs but for the wrong reasons" (p. 12). If the one-other 
instruction encourages shifting the attentional spotlight to a design that bears a singular 
relationship to the positive example, participants may adopt a strategy of elimination to 
arrive at the correct answer, bypassing the more demanding tests of multiple hypotheses. 
Moreover, the classification task itself is abbreviated by the request to identify one design 
rather than three. Thus, the one-other instruction may serve to reduce cognitive 
complexity in two respects, neither of which necessarily promotes application of the 
appropriate logical reasoning strategies. 
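
One hypothetical way to operationalize such an elimination strategy is sketched below. The feature-overlap count and the function names are assumptions made for illustration; they are not a model proposed by Griggs et al. (1998) or O'Brien et al. (1990). The sketch simply picks the remaining design whose relationship to the positive example is shared by no other design, without testing any hypothesis.

```python
from collections import Counter

# Illustrative representation (an assumption): a design is a (color, shape) pair.
POSITIVE = ("Black", "Diamond")
REMAINING = [("White", "Diamond"), ("Black", "Circle"), ("White", "Circle")]


def shared_features(design, example=POSITIVE):
    """Number of properties (color, shape) a design shares with the example."""
    return sum(a == b for a, b in zip(design, example))


def unique_design(designs=REMAINING):
    """Return the design whose overlap with the example is one of a kind, if any."""
    counts = Counter(shared_features(d) for d in designs)
    singles = [d for d in designs if counts[shared_features(d)] == 1]
    return singles[0] if len(singles) == 1 else None


print(unique_design())
```

In this sketch the White Diamond and Black Circle each share one property with the Black Diamond, whereas the White Circle alone shares none, so the heuristic selects the White Circle. It reaches the correct design without ever applying the exclusive disjunctive rule, which is the sense in which the answer may be right "for the wrong reasons."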

Providing Procedural Cues 
Given a variety of unsuccessful attempts to promote logical facilitation using the 
standard problem context and instructions, Smyth and Clark (1986, Exp. 5) devised an 
ordered set of questions to guide participants through the logical steps leading to the 
solution of the Uncued Half-Sister and the standard THOG problems. For the Uncued 
Half-Sister, prior to indicating whether or not each woman was a half-sister (or whether 
there was insufficient information to decide), participants answered the following 
questions: 

1. Given that Robin is my half-sister, who could my parents be? 

2. If the first pair of parents you gave in answer to Question 1 actually were my 
parents, which of the other women would be my half-sister? 

3. Did you write down two possible pairs of parents in answer to Question 1? If 
so, which of the women would be my half-sister if the second pair you choose 
actually were my parents? (Smyth & Clark, 1986, p. 284) 

Similar questions were developed for the colors and shapes in the standard THOG 
problem. No feedback was given regarding answers to these questions. For both the 






Uncued Half-Sister and the standard THOG problems, the effect of the three questions on 
solution rate was minimal. Realism did, however, enhance performance on the three 
preclassification questions. The proportion of correct answers to Question 1 was higher 
for the Uncued Half-Sister problem (78%) than for the THOG problem (47%), as was the 
proportion of correct answers to Questions 2 and 3 (47% for the Uncued Half-Sister 
problem vs. 31% for the THOG problem). However, the solution rate was comparable for 
the two problems (22% for the Uncued Half-Sister problem, and 19% for the THOG 
problem). 

Results highlighted the confusion stemming from dealing with two hypotheses. 
Across both problems, a substantial minority of participants (23%) answered Question 1 
correctly but offered incorrect responses to Questions 2 and 3. In fact, across both 
problems, 19% of the participants answered all three questions correctly, yet failed to 
correctly solve the problem. The addition of realism could not overcome apparent 
confusion resulting from uncertainty regarding which pair of parents or properties was 
relevant to the ultimate classification. 

In the standard THOG context, O'Brien et al. (1990, Exp. 2) attempted to 
ameliorate one source of difficulty by explicitly specifying what combination of 
properties could be written down and providing a rationale for these possibilities. In the 
Pretest THOG problem, after the problem statement, participants read the following. 

Because you know that a THOG has either the shape I have written down or the 
colour I have written down, but not both, it follows that either the colour I have 
written down is black or the shape I have written down is diamond, but not both. 
Further, because there are only two shapes, it follows that if the colour I have 
written down is black, then the shape I have written down is circle; and because 
there are only two colours, it follows that if the shape I have written down is 
diamond, the colour I have written down is white. There are, therefore, two 









possible combinations that I could have written down: diamond and white or 
circle and black. (O'Brien et al., 1990, p. 340) 



Participants then responded to a group of four questions regarding which designs 
could or could not be THOGs given each possible combination (see Appendix A, O'Brien 
et al., 1990, Pretest THOG, for exact wording). The standard three-choice classification 
instruction followed. To reveal possible confusion related to the status of the Black 
Diamond, O'Brien et al. asked participants to classify all four designs. 

Slightly more than half (55%) of the participants correctly answered all four 
pretest questions. In this group, similar proportions of respondents then misidentified the 
Black Diamond as NOT a THOG (45%) and correctly classified all four designs (45%). 
The written justifications of those who indicated that the Black Diamond was not a 
THOG suggested that these participants mistakenly considered black and diamond to be 
the properties that had been written down. Overall, 25% of the participants offered the 
correct solution to the Pretest THOG, compared to the standard THOG baseline of 5% 
(O'Brien et al., 1990, Exp. 1) that also requested classification of all four designs. 
However, in the baseline condition, no one misidentified the Black Diamond, whereas 
45% of the participants did so after responding to pretest questions. This high level of 
misidentification overshadowed the proportion of intuitive errors on the Pretest THOG 
(20%), which was considerably less than the proportion of intuitive errors on the standard 
THOG baseline (55%). 

O'Brien et al. (1990) concluded that many participants harbored the faulty 
assumption that black and diamond were the written-down properties. Explicitly providing 
information that refuted this conception appeared to result in further confusion rather than 






clarification. Further, participants who did understand how to determine hypotheses may 
have lacked the procedural knowledge to follow through with the classification 
procedures, thereby minimizing the effect of knowledge about the hypothesized 
properties. 

The Explanation THOG (O'Brien et al., 1990, Exp. 2) addressed the issue of 
how to classify designs given each hypothesis. In this condition, participants read a 
detailed rationale that not only identified which properties were written down but also 
explicitly indicated which designs were and were not THOGs given each possibility. This 
explanation, shown below, thereby provided procedural guidance to lead participants to 
the point of comparing the results of design identification across two hypotheses: 

Consider all of the combinations that I might have written down: diamond and 
black, circle and black, diamond and white, and circle and white. I could not have 
written down both diamond and black because the Black Diamond includes both 
of these features. Similarly, I could not have written down both circle and white 
because the Black Diamond contains neither of these features. But the remaining 
combinations each include one of the features of a Black Diamond, and hence I 
could have written down either circle and black, or diamond and white. 

Consider first the possibility that I wrote down circle and black. In this case, a 
Black Circle cannot be a THOG because it includes both features, and a White 
Diamond cannot be a THOG because it includes neither feature. But a White 
Circle is a THOG because it includes one of the features and not the other. 
Consider next the possibility that I wrote down diamond and white. In this case a 
Black Circle cannot be a THOG because it includes neither of the features and a 
White Diamond cannot be a THOG because it includes both features. Again, a 
White Circle is a THOG because it includes one of the two features and not the 
other. (O'Brien et al., 1990, pp. 340-341) 

The standard three-choice instruction followed this explanation. Participants were 
asked to classify all four designs. 
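
The procedure laid out in the explanation can be sketched in a few lines of code. This is an illustrative rendering under stated assumptions, not material from O'Brien et al. (1990): the function names and the tuple representation are hypothetical. The sketch enumerates all four combinations that might have been written down, retains those consistent with the Black Diamond being a THOG, evaluates every design under each retained combination, and then combines the verdicts into the standard three-choice classification.

```python
from itertools import product

COLORS = ("Black", "White")
SHAPES = ("Diamond", "Circle")
POSITIVE = ("Black", "Diamond")  # the design known to be a THOG


def is_thog(design, written):
    """Has the written-down color or the written-down shape, but not both."""
    return (design[0] == written[0]) != (design[1] == written[1])


def classify(positive=POSITIVE):
    """Return the viable hypotheses and a three-choice label for every design."""
    combos = list(product(COLORS, SHAPES))
    # Step 1: keep only the combinations under which the positive example is a THOG.
    hypotheses = [w for w in combos if is_thog(positive, w)]
    labels = {}
    for design in combos:
        # Step 2: evaluate the design under each viable hypothesis.
        verdicts = {is_thog(design, w) for w in hypotheses}
        # Step 3: combine the verdicts across hypotheses.
        if verdicts == {True}:
            labels[design] = "THOG"
        elif verdicts == {False}:
            labels[design] = "not a THOG"
        else:
            labels[design] = "indeterminate"
    return hypotheses, labels


hypotheses, labels = classify()
print(hypotheses)
for design, label in labels.items():
    print(design, "->", label)
```

Because both viable hypotheses yield the same verdict for every design, no design ends up indeterminate in this sketch. The third step, combining verdicts across hypotheses, is the point at which the explanation leaves participants to draw the conclusion themselves.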

Participants were unable to use this information appropriately. Those who were 
provided with an explanation did no better at solving the problem than did those who 




attempted to classify the four designs in the standard THOG problem (5% correct answers 
for each). Rather than reducing confusion, the explanation appeared to enhance it. A large 
majority of participants who were presented with the explanation (70%) proceeded to 
classify the Black Diamond as "not a THOG" (60%) or "indeterminate" (10%). As noted 
previously, no participants exhibited this form of confusion in the standard THOG 
condition (O'Brien et al., 1990, Exp. 1). 

O'Brien et al. (1990) reported that written justifications indicated that many 
participants perceived a contradiction in the problem. Some reacted to this by classifying 
the Black Diamond as "not a THOG." In other instances, participants neglected to 
consider the statement identifying the Black Diamond as a THOG, perhaps because they 
were overloaded by the detailed explanation that followed it. O'Brien et al. concluded 
that participants "who do not appreciate the task on their own are not apt to benefit by 
having it explained to them" (O'Brien et al., 1990, p. 343). 

Although it is possible that the lack of facilitation reflected the complexity of the 
particular explanation provided, the Explanation THOG highlights the challenge of 
determining how to overcome or bypass the dual-hypotheses stumbling block. If 
participants did understand the explanation, then its null effect identifies simultaneous 
testing of multiple hypotheses as the locus of major difficulty. However, the low solution 
rate may reflect in part an incremental creation of ambiguity (above that of the standard 
THOG problem) related to the request to classify all four designs. The fact that it was 
necessary to classify the Black Diamond may have created doubt about the veracity of the 
statement indicating that the Black Diamond was a THOG. 






Summary 
Evidence regarding the primary interpretation of disjunctions as inclusive or 
exclusive has been mixed; however, with a strict criterion, an exclusive interpretation 
seems favored (Newstead & Griggs, 1983). Interpretations are also context dependent, 
although with the exception of qualification scenarios, the exclusive interpretation tends 
to predominate. Participants are more consistently geared toward the exclusive meaning 
when they evaluate the conclusions of syllogisms than when they indicate whether an 
outcome is consistent with a disjunctive rule. Because the THOG problem explicitly 
includes the phrase "but not both," it would tend to propel participants toward an 
exclusive interpretation. 

Early investigations of potential sources of difficulty (e.g., Wason & Brooks, 
1979) have revealed that people can understand and apply the exclusive disjunctive rule. 
Given the Black Diamond as a positive example, a majority of people also display 
competence in generating the dyad of hypotheses (the written-down properties), a finding 
that has been consistently replicated (Girotto & Legrenzi, 1989, Exp. 1; Girotto & 
Legrenzi, 1993, Exps. 1, 2, & 3; Griggs, Platt, Newstead, & Jackson, 1998, Exps. 1, 2, & 
3; Smyth & Clark, 1986, Exp. 3). Beyond this point, however, difficulties emerge. 
Intuitive errors suggest that people do not typically proceed to perform the necessary 
analyses based on the hypotheses. Instead, participants often seem distracted by the 
positive example, classifying designs by assessing their similarity to the Black Diamond. 
This primitive matching bias reportedly leads to an answer pattern in which the White 
Diamond and Black Circle have indeterminate status and the White Circle is not a 
THOG. The common element fallacy, an assumption that the properties of the Black 









Diamond are also the hypothesized properties, reportedly results in a mirror image of the 
correct response. Griggs and Newstead (1983) have presented evidence favoring, but not 
consistently supporting, the matching bias alternative. 

Interweaving the THOG problem in realistic scenarios has not consistently 
improved performance (Evans, Newstead, & Byrne, 1993). Moreover, when realism does 
enhance solution rate, the improvement may stem from nonlogical factors such as 
memory cueing (Newstead, Griggs, & Warner, 1982). However, isomorphs of the THOG 
problem that clarify the binary nature of the structural tree do boost solution rates (Griggs 
& Newstead, 1982), as do problems that clarify the separation between the properties 
written down and the positive example (Girotto & Legrenzi, 1989). However, separation 
alone does not appear to be sufficient for facilitation (Newstead & Griggs, 1992); rather, 
the effect of separation seems dependent on prior generation of hypotheses. 

Girotto and Legrenzi (1993) have extended the influence of separation to abstract 
versions of the THOG problem. Explicitly labeling each combination of properties 
written down as a possible SARS boosts accurate identification of the White Circle as a 
THOG, but again this facilitation has been linked to the presence of a request to generate 
hypotheses (Griggs, Platt, Newstead, & Jackson, 1998). Further, facilitation is dependent 
on the use of an "other" instruction rather than the standard three-choice version (Griggs 
et al., 1998). The "other" instruction may serve to focus attention on the design that has 
not been previously classified as a SARS, thus leading to the correct answer without 
requiring testing of multiple hypotheses. Similar facilitation using the one-other THOG 
instruction (Griggs et al., 1998; O'Brien et al., 1990) supports this hypothesis. 




Efforts to enhance performance by providing procedural cues have had minimal 
effects on solution. Both Smyth and Clark (1986) and O'Brien et al. (1990) have 
introduced a series of questions designed to illuminate logical steps toward correct 
classifications. However, even among participants who correctly answered all questions, 
the ultimate solution remained elusive. Explicitly providing information identifying the 
correct hypotheses and identifying which designs are THOGs and are not THOGs given 
each alternative has also failed to facilitate performance (O'Brien et al., 1990). 

Thus, two decades of research have identified multiple steps along the THOG 
solution path that divert participants from correct answers. First, whereas most people can 
correctly generate two hypotheses, some cannot. Second, people who can correctly 
generate two hypotheses do not consistently identify which designs are THOGs and 
which are not THOGs for each of the two possible combinations of hypothesized 
properties. Third, people who can correctly identify the THOGs and not-THOGs for both 
possible combinations of hypothesized properties often cannot accurately combine this 
information to classify the designs. My dissertation research addresses each of these 
difficulties in an attempt to find clues for reducing the cognitive complexity of this 
intriguing problem. To provide a concise framework for my investigation, the following 
section highlights those existing studies that are most directly relevant to my research and 
outlines the design of my series of experiments. 






DISSERTATION FRAMEWORK AND PLAN 
The literature review suggests that participants often appear to react to the 
complexity of the THOG problem by circumventing logical reasoning through the use of 
what Yachanin and Tweney (1982) termed "cognitive short-circuiting." When faced with 
a task requirement that they do not understand or which overburdens working memory, 
participants opt to continue by employing nonlogical but less demanding strategies such 
as matching. In so doing, participants focus on an irrelevant problem feature, or 
reconsider a relevant feature at an inappropriate time. The concept of cognitive 
short-circuiting was initially applied to the matching and verification biases in Wason's 
selection task (Yachanin & Tweney, 1982), but the use of simplified strategies to cope 
with arduous problems has emerged in more generalized hypothesis-testing paradigms as 
well. Doherty, Mynatt, Tweney and Schiavo (1979) noted that rather than face the 
challenge of evaluating data relevant to each of two hypotheses, participants chose to 
examine multiple pieces of information bearing on a single hypothesis. Such 
inappropriate selection of nondiagnostic information (pseudodiagnosticity) was 
positioned as a response to the problem's conceptual difficulty. In an attempt to alleviate 
reliance on use of dysfunctional strategies in the THOG problem, my dissertation aims to 
clarify and expand on prior efforts to reduce its cognitive complexity. 

Confusion between the properties of the positive example and those of the 
hypothesized combinations of properties has been identified as a major roadblock to 






solution in both realistic (Girotto & Legrenzi, 1989) and abstract (Girotto & Legrenzi, 
1993) THOG scenarios. When Girotto and Legrenzi (1989) compared performance on 
two realistic Spy problems, one that explicitly required a transformation of the presented 
properties of passports (Two-Level Spy problem) and another that did not (One-Level 
Spy problem), performance on the Two-Level Spy problem was significantly superior to 
performance on the One-Level version and the standard THOG baseline. Girotto and 
Legrenzi (1989) concluded that the transformation highlighted the distinction between the 
properties of the positive example (those on the original passport of the spy who arrived 
safely in Moscow) and the hypothesized properties (the altered properties that permitted 
safe arrival); furthermore, this separation was designated as the source of facilitation. The 
high solution rate on the Pub problem, devised to achieve separation without the need for 
transformation, offered support for the separation hypothesis, using the same abstract 
designs as in the standard THOG problem but positioning them in a thematic context. 
Thus, it appeared that providing a problem statement that inhibited the tendency to 
conflate the hypotheses and the properties of the positive example was sufficient to boost 
solution rates. However, Newstead and Griggs (1992) discovered a boundary condition 
for the facilitation noted in the Pub problem; namely, solution rates were boosted only if 
people were asked to generate hypotheses. 

A similar pattern of facilitation and identifiable constraints has emerged for 
separation in an abstract context. To clearly differentiate between the designated example 
and the properties written down, Girotto and Legrenzi (1993) created a problem that 
combined each color and shape combination written down under a single label, SARS. 
Prior to identifying the THOG design(s), people were asked to identify the SARS, based 







on the knowledge that (a) the Black Diamond was a THOG, and (b) a THOG had either 
the color or shape of the SARS, but not both. This version of the problem produced 
significant facilitation compared to the standard THOG and a THOG problem that 
required generation of hypotheses (Hypotheses-THOG). However, in lieu of the standard 
three-choice instruction, the SARS problem requested participants to indicate if there 
were other THOGs. Griggs et al. (1998) demonstrated that this "other" instruction, rather 
than the SARS label, was the impetus for facilitation. 

Thus, separation was not sufficient to enhance performance, in either a realistic or 
abstract scenario. If abstract designs were used, facilitation was directly linked to 
instruction wording, suggesting that the instruction may serve to direct attention to an 
aspect of the problem that leads people to the correct answer. This attentional explanation 
was supported by results of investigations using the one-other instruction, initially 
introduced by O'Brien et al. (1990). In their investigation, O'Brien et al. informed 
participants that in addition to the Black Diamond, one other design was a THOG. They 
then asked people to identify the singular THOG. This instruction significantly facilitated 
performance compared to the standard THOG instruction. Griggs et al. (1998) replicated 
this finding and extended it to the SARS problem. If the one-other instruction was used, 
performance on the SARS problem was enhanced even in the absence of a requirement to 
generate hypotheses. O'Brien et al. and Griggs et al. agreed that the enhanced solution 
rate may have stemmed from attentional factors rather than appropriate logical reasoning 
strategies. 

Attentional cueing has been implicated as a source of enhanced performance in 
reasoning tasks other than the THOG problem. For example, in the Wason selection task, 




rule variations devised by Evans, Ball and Brooks (1987) suggested that people attend to 
cards that are mentioned in the rule, making selections based on linguistic cues, without 
necessarily considering the implications of the hidden sides of the cards. If this tendency 
to match was reduced by incorporating a description of all four cards in a rule explication, 
performance was enhanced (Platt & Griggs, 1993), possibly stemming from an equalized 
distribution of attention. In the THOG problem, an instruction that served to narrow the 
range of candidates for possible THOGs evoked a higher solution rate (Girotto & 
Legrenzi, 1993; Griggs et al., 1998; O'Brien et al., 1990). The first three experiments in 
this dissertation research were designed to replicate and extend findings pertaining to the 
effects of separation, generation of hypotheses, and instruction complexity. 

Experiments 1a and 1b were designed to explore the possibility that narrowing the
range of classification options may result in facilitation comparable to that achieved by
the "other" or one-other instructions. In these experiments, the category "insufficient
information to decide" was eliminated from the standard three-choice instruction for both
the THOG and SARS problems, with and without Question 1 (a request to generate
hypotheses). In Experiment 1a, problem type (THOG or SARS) and number of
classification options (two-choice or three-choice), all without Question 1, were
manipulated using a 2 x 2 factorial design. In Experiment 1b, problem type (THOG or
SARS), number of classification options (two-choice or three-choice), and hypotheses
generation (Question 1 present or Question 1 not present) were manipulated using a
2 x 2 x 2 factorial design.
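
The crossing of factors in Experiment 1b can be illustrated by enumerating its cells. The following Python sketch is illustrative only; the variable names and level labels are assumptions chosen to mirror the description above and are not taken from the experimental materials.

```python
from itertools import product

# Factor levels as described in the text for Experiment 1b
# (labels are illustrative assumptions, not the original wording).
problem_types = ["THOG", "SARS"]
classification_options = ["two-choice", "three-choice"]
question_1 = ["present", "not present"]

# Crossing three two-level factors yields the cells of the 2 x 2 x 2 design.
conditions = list(product(problem_types, classification_options, question_1))

for problem, options, q1 in conditions:
    print(f"{problem} problem, {options} instruction, Question 1 {q1}")

print(len(conditions))  # prints 8
```

Each of the eight tuples corresponds to one between-subjects condition in a fully crossed design of this kind.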

Experiments 2a and 2b were designed to replicate and generalize the combined 
effect of separation via labeling and the use of the "other" instruction in an abstract
context, initially demonstrated by Girotto and Legrenzi (1993) with a sample of Italian 
technical high school students. Both experiments included the same three conditions used 
by Girotto and Legrenzi (the SARS, the Hypotheses-THOG, and a standard THOG
baseline), all of which used the "other" instruction. To determine if the "other"
instruction served as a source of facilitation, Experiments 2a and 2b also incorporated a 
standard THOG baseline with a three-choice instruction. Participants in Experiment 2a 
were high-ability undergraduates enrolled in a special section of introductory psychology, 
whereas participants in Experiment 2b were drawn from a general pool of introductory 
psychology students. 

To further explore the influence of instruction wording, Experiment 3 extended 
the work of O'Brien et al. (1990) and Griggs et al. (1998) in which participants were 
asked to identify one other THOG. Unlike these preceding investigations, Experiment 3 
included conditions in which participants were requested to generate hypotheses. In 
Experiment 3, problem type (THOG or SARS) and hypotheses generation (Question 1 
present or Question 1 not present) were manipulated using a 2 x 2 factorial design. The 
one-other instruction was employed in all conditions. 

Changing the number of alternatives or the wording of the instruction provides 
participants with linguistic cues that may prompt correct classification of designs or the 
correct selection of one other THOG design. However, these manipulations do not 
necessarily guide participants along the logical solution path, nor do they necessarily 
indicate which steps toward solution are particularly challenging or prone to error. Evans 
(1984, 1989) suggests that in complex reasoning tasks, people may err in a very early 
stage of reasoning by misrepresenting the problem. According to Evans (1984, 1989),
people may initially make inappropriate distinctions between relevant and irrelevant 
information. These judgments of relevance are made preattentively, in what Evans labels 
a heuristic stage, and may stem from matching to an item named in the problem 
statement. The rapid and preconscious characterization of this heuristic stage has been 
considered analogous to the manner in which we assess relevance in language comprehension
(Evans, 1995). The second stage in Evans' model encompasses analytic processing, 
whereby people apply logical reasoning to make inferences based on the problem 
representation they have constructed. Given the possibility of bias in the heuristic stage, 
sound analytic processing does not guarantee an accurate solution. 

In an attempt to bypass misrepresentation of the THOG problem, O'Brien et al.
(1990) provided participants with a complete explanation of the problem up to the point 
of combining information from the two hypotheses. The explanation indicated which two 
combinations of color and shape could be written down, why these combinations were the 
only logical possibilities, which designs were and were not THOGs given each 
hypothesis, and how decisions regarding design identification were made for each 
hypothesis. After reading the problem statement and explanation, participants classified 
the four designs. The solution rate for this Explanation THOG problem was identical to 
that for the standard THOG baseline, but the error patterns differed for the two problems. 
Providing an explanation increased the possibility that participants would incorrectly 
indicate that the positive example of THOGness, the Black Diamond, was not a THOG, 
or that its identity could not be determined from the information given. Experiment 4a 
was designed to replicate this counterintuitive result and to refine another attempt by 
O'Brien et al. (1990) to guide participants toward the correct solution. 
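
The reasoning that the explanation walks participants through can be sketched in a few lines of Python. This is a minimal sketch, assuming the standard formulation of the THOG rule (a design is a THOG if it has exactly one of the written-down color and the written-down shape) and the four standard designs; the function and variable names are hypothetical and are not part of the original materials.

```python
from itertools import product

COLORS = ["Black", "White"]
SHAPES = ["Diamond", "Circle"]
DESIGNS = list(product(COLORS, SHAPES))  # the four designs

def is_thog(design, hypothesis):
    """Assumed THOG rule: the design shares exactly one property
    (color or shape, but not both) with the written-down combination."""
    color, shape = design
    written_color, written_shape = hypothesis
    return (color == written_color) != (shape == written_shape)

# Step 1: keep only the written-down combinations under which the
# Black Diamond, the given positive example, is a THOG.
hypotheses = [h for h in DESIGNS if is_thog(("Black", "Diamond"), h)]

def classify(design):
    """Step 2: combine the verdicts reached under each remaining hypothesis."""
    verdicts = {is_thog(design, h) for h in hypotheses}
    if verdicts == {True}:
        return "definitely a THOG"
    if verdicts == {False}:
        return "definitely not a THOG"
    return "insufficient information to decide"

classification = {design: classify(design) for design in DESIGNS}

print(hypotheses)
for design, verdict in classification.items():
    print(design, "->", verdict)
```

Under these assumptions, only Black-Circle and White-Diamond remain as hypotheses, and the White Circle is classified as a THOG under both, so the indeterminate option is never the correct answer.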




In the Pretest THOG, O'Brien et al. (1990) offered participants a partial 
explanation and then provided them with structured pretest questions to further direct 
reasoning. After the problem statement, participants were told which two combinations of 
color and shape could be written down and why these combinations were the only logical 
possibilities. Then, participants answered a group of four questions by indicating which 
designs could and could not be THOGs given each possible combination of written-down 
properties. Active involvement in the process of generating design identifications given 
the correct hypotheses did not significantly enhance performance compared to the 
standard THOG baseline. Although about half of the participants correctly answered all 
four pretest questions, less than half of this group then correctly identified the four 
designs. Because grouping questions about multiple hypotheses may not have provided an 
optimal format for encouraging participants to compare answers for the two hypotheses, a 
revised format for the Pretest THOG was introduced in Experiment 4a. Performance on 
the Pretest THOG with all four questions grouped together (Pretest THOG, 
grouped-questions) was compared to performance using the same questions, but with a 
separate space provided for the answer to each (Pretest THOG, split-questions). For 
comparative purposes, Experiment 4a also included a standard THOG baseline. As in the 
O'Brien et al. investigation, participants in all conditions in Experiment 4a were 
instructed to classify all four designs. 

Because the standard THOG problem statement used by O'Brien et al. (1990) 
indicated that the Black Diamond was a THOG, their subsequent request to classify all 
four designs may have created rather than resolved confusion. To investigate this 
possibility, in Experiment 4b, the two Pretest questions concerning which designs could
be THOGs included the qualifier "other than the Black Diamond." Manipulations also 
addressed two other potential sources of difficulty. It appeared possible that variance in 
responses to the Pretest questions and the concluding classification instruction reflected 
differences in degree of commitment involved. Whereas the classification instruction 
included the standard alternatives (e.g., "definitely a THOG" and "definitely not a THOG" 
together with the indeterminate option), the pretest questions inquired about what design 
or designs could or could not be THOGs. Thus, participants' answers to the pretest 
questions may have been based on a less-than-certain probability. 

This uncertainty may have played a role in subsequent design classification. 
Decision-making investigations have demonstrated that uncertainty about preexisting 
conditions can create an unwillingness to commit to a decision (Shafir & Tversky, 1995; 
Tversky & Shafir, 1993). Logic dictates that if Alternative 1 is preferred over 
Alternative 2 if a specific event occurs, and if Alternative 1 is preferred over 
Alternative 2 if the specific event does not occur, then Alternative 1 should be preferred 
over Alternative 2 even in the absence of information about event occurrence. However, 
violations of this sure-thing principle have been demonstrated in decision-making 
paradigms. Uncertainty about event occurrence leads people to postpone decisions, rather 
than select the alternative that was favored in either case. In the THOG problem, the 
absence of information about which hypothesis is actually written down may adversely 
affect correct classifications. Adding further uncertainty by inquiring about which designs 
could be THOGs might compound indecision. 
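
The sure-thing principle described above can be expressed as a simple rule. The sketch below is illustrative only; the function name and the state and option labels are hypothetical. It also applies the same rule to the THOG problem, assuming the standard result that the White Circle is a THOG under either of the two possible written-down combinations.

```python
def sure_thing_choice(preferred_by_state):
    """Return the option preferred in every state of the world,
    or None if the preference depends on the unknown state."""
    choices = set(preferred_by_state.values())
    return choices.pop() if len(choices) == 1 else None

# Decision-making case: the same alternative is preferred whether or not
# the event occurs, so it should be chosen without knowing the outcome.
decision = sure_thing_choice({
    "event occurs": "Alternative 1",
    "event does not occur": "Alternative 1",
})

# THOG analogue (assumed standard analysis): the White Circle receives the
# same verdict under both hypotheses, so the unknown hypothesis is irrelevant.
white_circle = sure_thing_choice({
    "black and circle written down": "THOG",
    "white and diamond written down": "THOG",
})

print(decision)      # prints Alternative 1
print(white_circle)  # prints THOG
```

A participant who answers "insufficient information to decide" for the White Circle is, in these terms, declining to commit even though both states of the world lead to the same classification.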

To explore the potential differential effects of the word "could" versus the word
"definitely" on answers to pretest identification questions for each hypothesis and on
subsequent design classification, Experiment 4b included conditions in which the pretest 
questions asked which design or designs were definitely or definitely not THOGs. In 
addition, because O'Brien et al. (1990) suggested that participants may have assumed that 
black and diamond were the written-down properties (despite an explanation to the
contrary), Experiment 4b included conditions in which a reminder that black and 
diamond could not be written down was inserted prior to the classification instruction. 
Thus, Experiment 4b used a 2 x 2 factorial design, in which the manipulated variables 
were the word "definitely" in the pretest questions (present or not present) and a reminder 
about hypothesized properties (present or not present). 

To summarize my dissertation plan, the first three experiments primarily focused 
on effects of instruction wording and the final experiment was concerned with training. 
Experiments 1a and 1b compared performance with two-choice and three-choice
instructions on the SARS and THOG problems, with and without a request to generate 
hypotheses. Experiments 2a and 2b evaluated the effects of the "other" instruction on the 
standard THOG, and on the SARS and THOG problems with a request to generate 
hypotheses (based on Girotto & Legrenzi, 1993). Experiment 3 extended the work of 
O'Brien et al. (1990) with the one-other instruction. Experiments 4a and 4b aimed to 
direct attention toward the appropriate solution path, modeled on manipulations by 
O'Brien et al. (1990). Experiment 4a revisited efforts to explain the THOG problem to 
the point of combining the results for the two hypotheses, including conditions that 
introduced four pretest questions to encourage design classification under alternative 
hypotheses. Experiment 4b included modifications to pretest questions to uncover the 
possible influence of uncertainty and to address misconceptions stemming from
inappropriate selection of hypotheses. Together, these experiments were designed to 
identify key barriers to problem solution and suggest factors involved in facilitating 
performance. 



METHODS, MATERIALS AND RESULTS 
General Procedures 
To verify that each participant was involved in only one experiment, the 
experimenter checked participant code numbers versus a listing of code numbers from 
preceding experiments. In all experiments, participants were randomly assigned to 
conditions within groups. Prior to distributing the problem, the experimenter gave the 
following verbal instructions. 

This experiment is designed to investigate how people solve a deductive 
reasoning problem. You will be given a piece of paper providing instructions for 
the problem and the problem itself. You will be asked to read the instructions, 
then attempt to solve the problem. The problem requires logical reasoning and 
does have a correct answer. I am interested in your best performance, so please 
take your time and do not rush to make a judgment. (From Informed Consent 
form) 

The experimenter then distributed the problems, with problem version
manipulated between-subjects. All participants completed only one problem, presented on
one side of an 8-1/2" x 11" paper. Appendix B includes copies of materials for all
problems. Participants were given as much time as they needed to complete the problem.
All completed it within the 30-minute time slot set aside for each session. Most did so
within 10 to 15 minutes after beginning work on the problem.

Experiments 1a and 1b

The cognitive complexity of the THOG problem may be heightened in the final
classification stage by inclusion of the option "insufficient information to decide." By
allowing an escape from firm commitment, this possible choice may mislead some 
participants to the erroneous conclusion that answers depend on an unknown quantity, the 
hypothesis that is written down. Thus, eliminating the potential doubt created by the 
indeterminate option may result in enhanced performance. Experiments 1a and 1b were
designed to test this possibility. In these experiments, performance on SARS and THOG 
problems with three classification possibilities (definitely a THOG, insufficient 
information to decide, and definitely not a THOG) was compared to performance on 
problems with two classification choices. 
Experiment 1a

Participants. Eighty introductory psychology students and 39 upper-division 
psychology students at the University of Florida voluntarily participated in the pilot study. 
Participants from the introductory psychology course completed the two-choice THOG 
(N = 20), three-choice THOG (N = 20), two-choice SARS (N = 20) and three-choice 
SARS (N = 20). Participants from the upper level psychology course completed only the 
two-choice THOG (N = 19) and two-choice SARS (N = 20). Because there were no 
differences related to class level, the data for the two-choice conditions were combined 
prior to analysis. 

Materials. Four versions of the THOG problem were created by manipulating type 
of problem (THOG and SARS) and number of alternatives in the instruction (standard 
three-choice or two-choice, excluding the insufficient information option). Participants 
were required to identify the possible SARS in the SARS conditions but were not 
required to generate the color and shape written down in the THOG conditions. Designs
were presented vertically at the bottom of the page, following the problem statement and 
instructions. Participants wrote their answer for each design on a blank line to the right of 
the design. 

The wording of the THOG problem statement was identical to that of the standard 
THOG except it began with the phrase "At the bottom of the page" instead of "In front of
you." The wording of the SARS problem statement was identical to that used by Girotto
and Legrenzi (1993) through the rule statement, with the exception of the "At the bottom
of the page" phrase. The following question was added prior to the instructions for
classification: "Knowing for sure that the Black Diamond is a THOG, your first task is to 
indicate which design(s) could be the SARS." This differed from the Girotto and 
Legrenzi question in that it did not specifically state "among the remaining designs." 

Results. Narrowing the range of alternatives enhanced overall solution rates for 
both the THOG and SARS problems. Overall, 34% solved the two-choice versions 
compared to 8% who solved the versions with three classification options, 
χ²(1, N = 119) = 10.02, p = .002. In the THOG conditions, 38% of the participants
solved the problem given the two-choice instruction and 5% solved it given the
three-choice instruction. In the SARS conditions, 30% of the participants solved the
problem given the two-choice instruction and 10% solved it given the three-choice
instruction. 
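
The overall comparison of two-choice and three-choice versions can be checked with a short calculation. The sketch below is an illustration, not the original analysis: the cell counts are assumptions inferred from the reported percentages and group sizes (27 of 79 solvers with the two-choice instruction, 3 of 40 with the three-choice instruction), and the function name is hypothetical. It computes a Pearson chi-square without continuity correction.

```python
import math

def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the
    2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Counts inferred (assumption) from the reported percentages and cell sizes;
# they are not stated directly in the text.
two_choice_n, three_choice_n = 79, 40
two_choice_solved, three_choice_solved = 27, 3

chi_square = pearson_chi_square_2x2(
    two_choice_solved, two_choice_n - two_choice_solved,
    three_choice_solved, three_choice_n - three_choice_solved,
)

# For 1 degree of freedom, the upper-tail probability of a chi-square
# value x equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(round(chi_square, 2))  # prints 10.02
print(p_value)
```

With these inferred counts the statistic matches the reported value, and the computed probability rounds to the reported p = .002.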

Even though the SARS problems included a request to generate hypotheses and 
the THOG problems did not, performance was similar for both types of problems. Overall 
23% of the participants solved the SARS problems, and 27% solved the THOG problems.

Sixty-seven percent of the participants in the SARS conditions correctly identified 
both possible SARS combinations in Question 1 (65% given the two-choice instruction 
and 70% given the three-choice instruction). However, among participants who correctly 
generated two hypotheses, only 30% then accurately classified the designs (42% given the 
two-choice instruction and 7% given the three-choice instruction). Among those who did 
not correctly generate two hypotheses, one participant in each condition accurately 
classified the designs. 

Intuitive errors were more prevalent than other types of errors in all conditions 
(ranging from 37% to 45% of all errors). No more than two participants in any condition 
made any other specific error. 
Experiment 1b

Participants. One hundred forty-four undergraduates enrolled in introductory
psychology courses at the University of Florida took part in this experiment to fulfill a 
portion of their experimental participation requirement. Eighteen participants completed 
each of eight versions of the THOG problem described in the Materials section. 

Materials. Eight different versions of the THOG problem were created by 
manipulating type of problem (SARS and THOG), presence of a request to generate 
hypotheses (with Question 1 and without Question 1) and number of alternatives in the 
instruction (three-choice and two-choice). Materials were worded and formatted as in 
Experiment 1a, with the following exceptions. In the THOG conditions, for versions
requiring generation of hypotheses, the following question was added prior to the
instructions for classification: "Knowing for sure that the Black Diamond is a THOG,
your first task is to indicate which color and shape combination(s) I could have written
down." In the SARS conditions, for versions that did not require generation of hypotheses,
the following sentence was added prior to classification instructions: "I will tell you that 
the Black Diamond is a THOG." For all versions, six design orders were used. 

Results. Participants performed similarly on the THOG and SARS problems, as 
shown in Table 1. Given a standard problem with three possible classifications and
without a requirement to generate hypotheses, 11% of the participants solved the THOG
problem and 6% solved the SARS problem. When required to generate hypotheses, 17% 
solved each problem. 

More participants solved the problems with the two-choice instruction than solved 
those with the three-choice instruction in both the SARS and THOG conditions. Overall, 
33% solved the two-choice versions compared to 13% who solved the versions with three 
classification options, χ²(1, N = 144) = 8.85, p = .003. In the THOG conditions, 33% of
the participants solved the problem given the two-choice instruction and 14% solved it 
given the three-choice instruction. In the SARS conditions, 33% of the participants solved 
the problem given the two-choice instruction and 11% solved it given the three-choice 
instruction. Facilitation was more noticeable when participants were also required to 
generate hypotheses, 42% for the two-choice instruction versus 17% for the three-choice 
instruction, χ²(1, N = 72) = 5.45, p = .020, but also occurred when hypotheses generation
was not required, 25% for the two-choice instruction versus 8% for the three-choice
instruction, χ²(1, N = 72) = 3.60, p = .058.

Eighty-nine percent of the participants in the SARS conditions with Question 1 
correctly identified both possible SARS combinations, compared to 50% of the 
participants in the THOG conditions who correctly identified the two possible 
color-shape combinations written down, χ²(1, N = 72) = 12.83, p < .001. However,
among participants who correctly generated two hypotheses, only 34% in the SARS 
conditions and 50% in the THOG conditions then accurately classified the designs. 
Among those who did not correctly generate two hypotheses, no participants in the SARS 
conditions and only 6% of those in the THOG conditions accurately classified the 
designs. 

The majority of errors were intuitive. For conditions including the three-choice 
instruction, 50% to 60% of the participants who failed to solve the problems indicated
that the White Circle was not a THOG, and that the Black Circle and White Diamond 
were THOGs or that there was insufficient information to classify them. Type B errors 
predominated. For conditions including the two-choice instruction, error patterns were 
more varied. Between 38% and 81% of the errors were intuitive, with Type A being the 
only intuitive possibility, given that participants were forced to make a binary "Yes-No" 
decision for each design. 

Summary. Experiments 1a and 1b revealed that narrowing the range of
alternatives in the instruction led to facilitation. Solution rates appeared higher in 
two-choice than in three-choice conditions for both the THOG and SARS problems. 
Facilitation for the two-choice instruction generally occurred whether or not hypotheses 
generation was required. Solution rates were similar for the THOG and SARS problems,
mirroring the findings of Griggs et al. (1998). More participants correctly generated two 
hypotheses in the SARS conditions than in the THOG conditions. However, among those 
who correctly generated two hypotheses, solution rates did not differ by problem type. 

Experiments 2a and 2b 

The conditions in these experiments were designed to replicate those of Girotto 
and Legrenzi (1993, Exp. 1) with Italian high-school students. Experiment 2a was based 
on a classroom demonstration with students enrolled in a special honors section of 
introductory psychology. Experiment 2b included undergraduates enrolled in standard 
introductory psychology courses. The identical materials were used in both experiments. 
Each experiment included the three conditions employed by Girotto and Legrenzi (1993), 
plus a baseline condition using the standard THOG problem. 
Experiment 2a 

Participants. Forty-seven undergraduates enrolled in an honors introductory 
psychology course at the University of Florida voluntarily took part in this experiment as 
a classroom exercise. Twelve participants completed each of three problems (the SARS
problem, the standard THOG with an "other" instruction, and the standard THOG with a
three-choice instruction), and 11 completed the Hypotheses-THOG problem.

Materials. Four different versions of the THOG problem were used. The first three 
were identical to the problems used by Girotto and Legrenzi (1993, Exp. 1). All included 
the "other" THOG instruction: "Could you also indicate whether, in addition to the Black 
Diamond, there are other THOGs?" The SARS and Hypotheses-THOG problems each 
incorporated a request for generation of hypotheses, whereas the standard THOG did not.
The fourth condition (not included in the Girotto and Legrenzi experiment) involved the 
standard THOG problem with the three-choice instruction ("definitely a THOG," 
"definitely not a THOG," and "insufficient information to decide"). In all versions, the 
designs were presented horizontally at the top of the page, prior to the problem statement. 
For all versions, three design orders were used. 

Results. A majority of participants (83%) solved the SARS problem by correctly 
identifying the White Circle as a THOG. This solution rate was significantly greater than 
that for the standard THOG with a three-choice instruction (33%), χ²(1, N = 24) = 6.17,
p = .013, but only marginally greater than that for the Hypotheses-THOG (45%),
χ²(1, N = 23) = 3.63, p = .057, and statistically similar to that for the standard THOG
with an "other" instruction (58%). Performance in the Hypotheses-THOG condition was 
similar to that in the two standard THOG conditions (which did not differ significantly 
from each other). 

All participants in the SARS condition correctly generated two hypotheses, as did 
73% of those in the Hypotheses-THOG condition. Among participants who correctly 
generated two hypotheses, 83% in the SARS condition and 50% in the 
Hypotheses-THOG condition subsequently identified the White Circle as a THOG. 
Among those who did not correctly generate two hypotheses in the Hypotheses-THOG 
condition, only 33% accurately identified the White Circle. 

Intuitive errors accounted for 50% of all errors on the standard THOG problem 
with a three-choice instruction. Among participants completing the standard THOG with
an "other" instruction, 60% of those who made errors identified the Black Circle and
White Diamond as THOGs, and did not classify the White Circle. In the 
Hypotheses-THOG and SARS conditions, there were no intuitive errors. No error pattern 
occurred more than once. Girotto and Legrenzi (1993) also found that only a small 
proportion of the errors in the SARS condition were intuitive (14%). However, intuitive 
errors were predominant in both the Hypotheses-THOG and standard THOG with "other" 
instruction conditions (58% and 56% respectively). 
Experiment 2b 

Participants. Eighty-five undergraduates enrolled in introductory psychology 
courses at the University of Florida took part in this experiment to fulfill a portion of their 
experimental participation requirement. Eighteen participants completed the SARS 
problem, 32 completed the Hypotheses-THOG problem, 18 completed the standard
THOG problem with an "other" instruction, and 17 completed the standard THOG with a 
three-choice instruction. 

Materials. The materials used were identical to those in Experiment 2a. For all 
versions, six design orders were used. 

Results. As shown in Table 2, a majority of participants (61%) solved the SARS 
problem by correctly identifying the White Circle as a THOG. This solution rate was 
significantly greater than that for the standard THOG (three-choice instruction), 18%,
χ²(1, N = 35) = 6.88, p = .009, but did not significantly surpass the solution rates for
either the Hypotheses-THOG or the standard THOG ("other" instruction), 38% and 44%
respectively. Performance in these latter two conditions was similar, with neither problem
producing significant facilitation compared to the standard THOG (three-choice 
instruction). 

In the SARS and Hypotheses-THOG conditions, similar proportions of 
participants correctly generated the two hypotheses (72% and 66% respectively). Among 
participants who correctly generated two hypotheses, 85% in the SARS condition and 
52% in the Hypotheses-THOG condition subsequently identified the White Circle as a 
THOG. Among those who did not correctly generate two hypotheses, none of the 
participants in the SARS condition and 9% of those in the THOG conditions accurately 
classified the designs. 

The majority of errors (64%) were intuitive only on the standard THOG 
(three-choice instruction) problem. On the Hypotheses-THOG problem, the predominant 
error was indicating that all designs could be THOGs (45%), with "none could be 
THOGs" and intuitive errors each accounting for 15% of the incorrect answers. On the 
standard THOG ("other" instruction) problem, the predominant error was indicating that 
the Black Circle and White Diamond could be THOGs (40% of all errors), with "all could 
be THOGs" accounting for 30% of the incorrect answers. On the SARS problem, only 
one error pattern occurred more than once. Two participants (29% of those who gave an 
incorrect answer) indicated that the Black Circle was a THOG. 

The pattern of solution rates in the present experiments was generally comparable 
to the results of Girotto and Legrenzi (1993). In Experiments 2a, 2b, and Girotto and 
Legrenzi (1993, Exp. 1), more participants solved the SARS problem than solved either 
the Hypotheses-THOG problem or the standard THOG ("other" instruction) problem, but
these differences reached significance only in the Girotto and Legrenzi research. In 
Experiments 2a and 2b, performance on the Hypotheses-THOG was equivalent to that on 
the standard THOG ("other" instruction), paralleling the Girotto and Legrenzi findings, 
and to the standard THOG (three-choice instruction), not included in the Girotto and 
Legrenzi investigation. 

Summary. Although more participants solved the SARS problem than solved any of
the Hypotheses-THOG, standard THOG ("other" instruction), or standard THOG
(three-choice instruction) problems, performance was significantly enhanced in
Experiments 2a and 2b only versus the standard THOG (three-choice instruction) 
problem. The significant facilitation noted by Girotto and Legrenzi (1993) for the SARS 
problem versus the Hypotheses-THOG problem and the standard THOG problem, all 
with an "other" instruction, did not occur in the present experiments, although the 
solution-rate patterns were generally comparable. Any facilitation attributable to the 
"other" instruction does not necessarily imply that participants evaluated multiple 
hypotheses. Instead, as noted by Griggs et al. (1998), the "other" instruction may have 
encouraged people to solve the problem through a process of elimination, selecting the 
White Circle as a THOG because the Black Circle and White Diamond had been 
classified as SARS. 

Experiment 3 

If participants are adopting an elimination strategy, then reinforcing this strategy 
by explicitly stating there is one other THOG would continue to reveal, and possibly 
increase, facilitation. O'Brien et al. (1990) found such facilitation for a one-other
instruction on the THOG problem without a request to generate hypotheses. Similarly, 
Griggs et al. (1998) found facilitation for a one-other instruction on both the THOG and 
SARS problems, with neither containing a request to generate hypotheses. Experiment 3 
aimed to assess whether the facilitation attributed to the one-other instruction varied with 
problem type (SARS vs. THOG) and with the request to generate hypotheses (Question 1 
vs. no Question 1). 

Participants. Seventy-three undergraduates enrolled in introductory psychology 
courses at the University of Florida took part in this experiment to fulfill a portion of their 
experimental participation requirement. Eighteen participants completed each of three
problems (the THOG problem with Question 1, the SARS problem with Question 1, and
the SARS problem without Question 1), and 19 participants completed the standard THOG
problem without Question 1.

Materials. Four versions of the THOG problem were created by manipulating type 
of problem (THOG and SARS) and presence of the request to generate hypotheses 
(Question 1 and no Question 1). All four versions used the one-other instruction. The 
wording of the problem statements and of the request to generate hypotheses was the 
same as in Experiment 2b, as was the format. For all versions, six design orders were 
used. 

Results. The one-other instruction enhanced solution rates in all conditions except 
among participants who worked on the standard THOG problem without Question 1. In
the other three conditions, performance was comparable (solution rates of 72%, 78% and
83% for the THOG with Question 1, SARS without Question 1, and SARS with Question
1 conditions respectively) and significantly higher than the 37% solution rate on the
standard THOG problem without Question 1, χ²(3, N = 72) = 11.18, p = .011.

Similar proportions of participants who completed the SARS and THOG 
problems correctly generated two hypotheses when requested to do so (90% and 87% 
respectively). The vast majority of those who generated the two hypotheses subsequently 
identified the White Circle as the THOG (88% and 87% for the SARS and THOG 
problems respectively). Among those who did not identify the two hypotheses, one 
participant identified the White Circle as the THOG in the SARS condition and none did 
so in the THOG condition.

Solution rates indicated that the request to generate hypotheses for the SARS 
problem had little effect beyond that of the one-other instruction. In contrast, a high 
solution rate was observed on the THOG problem only when it included both Question 1 
and the one-other instruction. For no apparent reason, these results conflicted with prior 
findings demonstrating that the one-other instruction significantly facilitated performance 
on the standard THOG problem even without a request to generate hypotheses (Griggs 
et al., 1998; O'Brien et al., 1990). 

Experiments 4a and 4b 

Results of prior experiments typically demonstrated that facilitation was elusive 
unless the instruction was worded to focus attention on a limited number of 
classifications or designs to be classified. In Experiments 1a and 1b, such focus was
attained by reducing the number of alternatives. In Experiments 2a and 2b, facilitation
stemmed from asking if there were other THOGs, permitting use of an elimination
strategy. In Experiment 3, telling participants that there was only one other THOG led to 
a higher solution rate. These experiments provided insight regarding how to overcome 
difficulties impeding identification of the White Circle as a THOG; however, they did not 
address how to achieve facilitation on the standard THOG problem with the three-choice 
instruction. Experiments 4a and 4b aimed to effect such facilitation by guiding 
participants to the correct solution. 
Experiment 4a 

O'Brien et al. (1990) were unsuccessful in their attempts to enhance solution rate, 
despite providing either a complete explanation (Explanation THOG) or an explanation 
of what could be written down and questions about design classification for each possible 
hypothesis (Pretest THOG). Experiment 4a was designed to re-examine these 
manipulations. In their Pretest experiment, O'Brien et al. introduced a single grouping of 
four pretest questions. Because it is possible that using a large group of questions 
diminished the effectiveness of these inquiries, Experiment 4a included an additional 
reformatted version of the Pretest THOG in which a separate space was provided for the 
answer to each question. 

Participants. Sixty introductory psychology students at the University of Florida 
took part in this experiment to fulfill a portion of their experimental participation 
requirement. Fifteen participants completed each of four versions of the THOG problem 
described in the Materials section. In addition, fifteen participants from the same participant 
pool were included in a replication of the Pretest THOG (split-questions) condition. 




Materials. Four different versions of the THOG problem were used. Consistent 
with O'Brien et al. (1990), all versions required participants to classify all four designs 
(including the Black Diamond) as "Definitely a THOG," "Definitely not a THOG," or 
"Insufficient information to decide." The standard THOG, Explanation THOG, and 
Pretest THOG (grouped-questions) were replications of O'Brien et al. conditions. 
Appendix A provides exact wording for the latter two versions. Wording of the problem 
in the fourth condition, Pretest THOG (split-questions), was identical to the wording of 
the Pretest THOG (grouped-questions). However, space for an answer was provided after 
each question, instead of grouping all four questions together. Designs were presented 
horizontally, prior to the problem statement. For all versions, six design orders were used. 

Results. Providing participants with an explanation that included the correct 
classification of designs given either hypothesis failed to enhance solution rates. 
Congruent with O'Brien et al. (1990), performance on the Explanation THOG (47% 
correct) was no better than performance on the standard THOG baseline (40% correct). 
Table 3 illustrates that error patterns were similar on the two problems as well. However, 
the error pattern for the Explanation THOG in the present study differed from the O'Brien 
et al. findings. A majority of participants (70%) in the O'Brien et al. study did not classify 
the Black Diamond as "Definitely a THOG," compared to only one participant (7%) in 
Experiment 4a. The O'Brien et al. examination of written justifications suggested to them 
that participants responded as if black and diamond were written down or reacted to a 
perceived contradiction in the explanation by classifying all designs in the "not THOG" 
category. In the present experiment, the explanation appeared to have a null effect, rather 
than compounding confusion. 

Similarly, requesting participants to answer a group of questions by indicating 
which designs could or could not be a THOG given each hypothesis did not enhance 
performance (47% correct) relative to the standard THOG. Of the nine participants (60%) 
who answered all four questions correctly, six (67%) then correctly identified all four 
designs, two (22%) indicated that there was insufficient information to decide about all 
four designs, and one (11%) gave another response. One participant (7%) who indicated 
that both the Black Diamond and White Circle could be THOGs given either hypothesis 
but only gave partial answers (the design that was written down) to pretest questions 
regarding what designs could not be THOGs, then correctly identified all four designs. 
Five participants (33%) did not identify both the Black Diamond and White Circle as 
THOGs in both relevant pretest questions. None of this group solved the problem. 

Performance on the Pretest THOG (grouped-questions) differed from performance 
in the O'Brien et al. (1990) investigation in two respects. First, O'Brien et al. found that 
pretest questions had "some salutary effect" (25% solved the problem) compared to their 
baseline (5% correct), whereas performance on the two versions in the present experiment 
was similar. Second, a seemingly higher proportion of participants in the O'Brien et al. 
study (45%) did not classify the Black Diamond as a THOG, compared to 20% in the 
present study. The proportion of participants answering all four questions correctly in 
Experiment 4a was similar to that in the O'Brien et al. study. 









Separating the four pretest questions to enable respondents to answer more systematically 
and to inspect their answers appeared to enhance their ability to subsequently classify all 
four designs correctly (73%). However, the level of facilitation did not reach significance 
in either a four-condition comparison, χ²(3, N = 60) = 3.94, p = .268, or in a direct 
comparison with performance on the Explanation THOG or Pretest THOG 
(grouped-questions), both 47% correct, χ²(1, N = 30) = 2.22, p = .136 for both 
comparisons, although it did approach significance compared to performance on the 
standard THOG, 40% correct, χ²(1, N = 30) = 3.39, p = .065. Correct answers on the 
Pretest THOG (split-questions) problem stemmed in part from participants who gave no 
more than partial answers to at least two of the four pretest questions. For example, three 
participants (20%) who correctly indicated that both the White Circle and Black Diamond 
could be THOGs, but only the design not written down could not be a THOG, then 
correctly identified all four designs, whereas one participant (7%) with a similar pattern 
of responses to the pretest questions did not. One participant (7%) who suggested that if 
White Diamond was written down, then the White Circle could be a THOG and the Black 
Circle could not, but if Black Circle was written down, then the Black Diamond could be 
a THOG and the White Diamond could not, also correctly identified all four designs. 
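
The chi-square comparisons reported for Experiment 4a can be checked from the cell counts. The following is a minimal Python sketch, assuming the counts of correct classifications implied by the reported percentages and Table 3 (6, 7, 7, and 11 of 15 participants). The function and variable names are illustrative and are not part of the original analysis, and the p-value helper covers only the two degrees-of-freedom cases needed here.

```python
import math


def pearson_chi_square(table):
    """Pearson chi-square statistic (no continuity correction) and degrees
    of freedom for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


def chi_square_p(stat, df):
    """Upper-tail probability of the chi-square distribution.
    Closed forms are used, so only df = 1 and df = 3 are handled."""
    tail = math.erfc(math.sqrt(stat / 2))
    if df == 1:
        return tail
    if df == 3:
        return tail + math.sqrt(2 * stat / math.pi) * math.exp(-stat / 2)
    raise ValueError("this sketch handles only df = 1 and df = 3")


# Assumed counts of participants (out of 15) who classified all designs correctly.
N_PER_CONDITION = 15
CORRECT = {"standard": 6, "explanation": 7, "grouped": 7, "split": 11}


def rows(*conditions):
    """Build [correct, incorrect] rows for the named conditions."""
    return [[CORRECT[c], N_PER_CONDITION - CORRECT[c]] for c in conditions]


comparisons = {
    "four conditions": rows("standard", "explanation", "grouped", "split"),
    "split vs. grouped": rows("split", "grouped"),
    "split vs. standard": rows("split", "standard"),
}

for label, table in comparisons.items():
    stat, df = pearson_chi_square(table)
    print(f"{label}: chi-square({df}) = {stat:.2f}, p = {chi_square_p(stat, df):.3f}")
```

Each pairwise comparison is a 2 x 2 table and therefore has one degree of freedom; the four-condition comparison is a 4 x 2 table with three.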

Seven participants (47%) answered all four questions correctly and subsequently 
correctly classified all four designs; one (7%) incorrectly classified the designs after 
offering correct answers to all four pretest questions. Neither of the two participants 
(13%) who offered another combination of at least partially correct answers to the pretest 
questions solved the problem. 




In a replication of the Pretest THOG (split-questions), 60% of the 15 participants 
correctly identified all four designs. The distribution of answers to pretest questions 
closely resembled that in the original experiment (53% answered all correctly, 33% 
correctly reported that the Black Diamond and the White Circle were THOGs under both 
hypotheses, but only one of the remaining designs was not a THOG, and 13% gave 
another pattern of answers). All eight participants (53%) who correctly answered all four 
pretest questions solved the problem, as did one participant (7%) who correctly 
designated the two possible THOGs for both hypotheses but offered only a single correct 
answer to each of the "not THOG" questions. 
Experiment 4b 

The common element fallacy proposes that participants err on the THOG problem 
because they think black and diamond are the written down properties. If that is the case, 
then applying the rule should reveal a contradiction; namely, that the Black Diamond is 
not a THOG. As noted, to investigate this possibility, O'Brien et al. (1990) asked 
participants to classify all four designs. In their baseline condition, all participants 
correctly classified the Black Diamond; however, when given either a full explanation or 
pretest questions, many concluded that the Black Diamond was not a THOG or not 
classifiable based on the information given. Participants reached this conclusion despite 
the fact that they were specifically told what could be written down. 

Experiment 4b was designed to examine whether or not asking participants to 
classify the Black Diamond actually created confusion that would not typically arise when 
attempting to solve the standard problem. Pretest questions were altered to request 
classification of designs other than the Black Diamond. In addition, some versions 
included a reminder that black and diamond could not be written down, whereas others 
asked which designs could definitely be or definitely not be a THOG given each 
hypothesis. 

Participants. Sixty introductory psychology students at the University of Florida 
took part in this experiment to fulfill a portion of their experimental participation 
requirement. Fifteen participants completed each of four versions of the THOG problem 
described in the Materials section. 

Materials. Four versions of the THOG problem were created by manipulating the 
presence of a reminder (whether or not the problem included the caveat "Reminder: 
Given the rule and the fact that the Black Diamond is a THOG, I could not have written 
down Diamond and Black without creating a contradiction." prior to the classification 
instruction) and the presence of the word "definitely" in the pretest questions (present or 
not present), using a 2 x 2 factorial design. All included four pretest questions, with a 
separate space after each for an answer. In all versions, questions regarding which designs 
could be a THOG included the qualifier "other than the Black Diamond." Designs were 
presented horizontally, prior to the problem statement. For all versions, six design orders 
were used. 

Results. As shown in Table 4, the proportion of correct answers was similar in all 
conditions, ranging from 33% to 47%. Without any further modifications, adding the 
qualifier "other than the Black Diamond" to pretest questions concerning which designs 
could be THOGs appeared to create rather than resolve confusion about the ultimate 
classification of the designs, yielding 47% correct classifications compared to 73% for the 
Pretest THOG (split-questions) in Experiment 4a and 60% in its replication (both without 
the qualifier). Although the decrement in correct classifications was not significant, 
χ²(1, N = 45) = 1.67, p = .197, there was a marginally significant drop in the proportion 
of participants who answered all four pretest questions accurately. In Experiment 4a and 
its replication, 53% of the participants answered all four pretest questions correctly 
compared to 27% who did so in the Pretest THOG plus Qualifier condition in Experiment 4b, 
χ²(1, N = 45) = 2.88, p = .090. The decline appeared to be primarily attributable to fewer 
completely correct identifications of designs that could not be THOGs (even though the 
qualifier was added only to the "could be a THOG" pretest questions). In Experiment 4a 
and its replication, seven participants (53%) correctly indicated the two designs that could 
not be THOGs given either hypothesis, compared to four participants (27%) who did so 
when the qualifier was added to the "could be" questions. In response to the "could be a 
THOG" pretest questions, in Experiment 4a and its replication, 13 participants (87%) 
indicated that both the Black Diamond and White Circle could be THOGs given either 
hypothesis, compared to 11 participants (73%) who indicated that the White Circle could 
be a THOG given either hypothesis when the qualifier "other than the Black Diamond" 
was added in Experiment 4b. 

Asking participants which designs definitely could or could not be THOGs did not 
influence the overall proportion of completely correct classifications, compared to 
conditions in which "definitely" was not included. However, in the "definite" conditions, 
fewer participants answered all four pretest questions correctly. Among 49 participants in 
all four conditions who offered either completely or partially correct answers to all pretest 
questions, those in conditions requesting a definite commitment were significantly less 
likely to provide completely correct answers to all four questions (16% of 25 participants) 
than were those in conditions in which a definite commitment was not required (42% of 
24 participants), χ²(1, N = 49) = 3.95, p = .047. Further, Table 5 shows that in the 
"definite" conditions, a majority of participants who correctly classified the designs 
offered only partially correct answers to the question regarding which designs could not 
be THOGs. Among the 24 participants in all four conditions who correctly classified all 
four designs, these correct classifications were significantly less likely to follow from 
completely correct answers to pretest questions in "definite" conditions (in which only 
18% of 11 participants offered completely correct pretest answers) than in conditions not 
requiring a definite commitment (in which 61% of 13 participants offered completely 
correct pretest answers), Fisher's Exact Probability test, two-tailed, p = .047. 
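
The Fisher exact probability reported above can be reproduced by summing hypergeometric probabilities. The following Python sketch assumes that the rounded percentages correspond to 2 of 11 participants in the "definite" conditions and 8 of 13 participants in the other conditions; these counts are inferred from the text, and the function name is illustrative. The two-tailed value is taken here as the total probability of all tables with the same margins that are no more probable than the observed table, which is one common definition.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact probability for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same row
    and column totals whose probability does not exceed that of the
    observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)

    def probability(k):
        # Probability that the top-left cell equals k, given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denominator

    observed = probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        probability(k)
        for k in range(lowest, highest + 1)
        if probability(k) <= observed * (1 + 1e-9)
    )


# Rows: "definite" conditions, conditions without a definite commitment.
# Columns: completely correct pretest answers, only partially correct answers.
# The counts are assumptions inferred from 18% of 11 and 61% of 13.
p_value = fisher_exact_two_tailed(2, 9, 8, 5)
print(f"Fisher exact test, two-tailed: p = {p_value:.3f}")
```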

Summary. Experiment 4a revealed that neither identifying the hypotheses and how 
they were determined nor providing a complete explanation of the reasoning underlying 
the THOG problem up to the point of comparing the results for the two hypotheses 
facilitated performance if participants were asked to classify all four designs. If questions 
were provided to guide participants toward the solution, encouraging participants to 
answer one question at a time appeared to offer some assistance compared to grouping 
the questions together, but the effect did not attain significance. Experiment 4b indicated 
that the difficulty was not attributable to being asked questions about all four designs 
rather than three nor to forgetting that black and diamond could not be the hypothesized 
properties. Participants who were asked to make definite commitments about what 
designs could not be THOGs under either hypothesis seemed somewhat more apt to 
generate only one possibility rather than two, compared to participants who were asked 
simply what designs could not be THOGs. Further, among participants who correctly 
classified all four designs, those who made definite commitments on pretest questions 
were more likely to derive their correct classifications from only partially correct pretest 
answers than were participants not asked to make definite commitments on pretest 
questions. 









Table 1 

Correct Responses by Problem Type, Hypotheses Generation, and Instruction 
(Experiment 1b) 

                 With hypotheses generation      Without hypotheses generation 
Problem type     Two-choice     Three-choice     Two-choice     Three-choice 
THOG                  7               3               5               2 
SARS                  8               3               4               1 

Note. There were 18 participants in each condition. 









Table 2 

Responses by Problem Type (Experiment 2b) 

                                                      Responses 
Condition                                   Correct   Intuitive error   Other 
SARS                                          11             0             7 
Hypotheses-THOG                               12             3            17 
Standard THOG ("other" instruction)            8             4             6 
Standard THOG (three-choice instruction)       3             9             5 

Note. There were 18 participants in the SARS and standard THOG (three-choice 
instruction) conditions, 32 in the Hypotheses-THOG condition, and 17 in the standard 
THOG ("other" instruction) condition. 



Table 3 

Responses by Problem Type (Experiment 4a) 

                                                         Responses 
                                                                   Black Diamond 
                                             Intuitive   Near      not THOG or 
Condition                          Correct   error       insight   indeterminate   Other 
Standard THOG                         6          5          0            1            3 
Explanation THOG                      7          4          0            1            3 
Pretest THOG (grouped-questions)      7 
Pretest THOG (split-questions)       11 

Note. There were 15 participants in each condition. All conditions included the standard 
three-choice instruction. 



Table 4 

Responses by Problem Type: Pretest THOG (split-questions) with Qualifier 
(Experiment 4b) 

                                                            Responses 
                                       Correctly                           Black Diamond 
                                       identified    Intuitive   Near      not-THOG or 
Condition                              all designs   error       insight   indeterminate   Other 
With Qualifier only                         7            3          1            3            1 
Qualifier + Reminderᵃ                       6            3          2            0            4 
Qualifier + "Definite"ᵇ                     6            1          4            1            3 
Qualifier + Reminder + "Definite"ᵃ,ᵇ        5 

Note. There were 15 participants in each condition. All conditions included the standard 
three-choice instruction. All pretest questions related to identifying THOGs included the 
qualifier "other than the Black Diamond." 

ᵃ In Reminder conditions, the following sentence was inserted after the pretest questions 
and before the instruction: "Reminder: Given the rule and the fact that the Black 
Diamond is a THOG, I could not have written down Diamond and Black without creating 
a contradiction." 

ᵇ In "Definite" conditions, the pretest questions instructed participants to indicate which 
designs were definitely or definitely not THOGs, rather than asking which designs could 
and could not be THOGs. 



Table 5 

Correct Classifications as a Function of Answers to Pretest Questions 
(Experiment 4b) 

                                                      Answers to pretest questions 
                                                                  Indicated White Circle was 
                                                                  THOG for both hypotheses 
                                       Correct     All pretest    Not-written     Written 
                                       classifi-   questions      combination     combination 
Condition                              cations     correct        was not THOG    was not THOG 
With qualifier only                      47%         20%             13%             13% 
Qualifier + Reminderᵃ                    40%         33%              7%              0% 
Qualifier + "Definite"ᵇ                  40%          7%             13%             20% 
Qualifier + Reminder + "Definite"ᵃ,ᵇ     33%          7%             13%             13% 

Note. There were 15 participants in each condition. All conditions included the standard 
three-choice instruction. All pretest questions related to identifying THOGs included the 
qualifier "other than the Black Diamond." 

ᵃ In Reminder conditions, the following sentence was inserted after the pretest questions 
and before the instruction: "Reminder: Given the rule and the fact that the Black 
Diamond is a THOG, I could not have written down Diamond and Black without creating 
a contradiction." 

ᵇ In "Definite" conditions, the pretest questions instructed participants to indicate which 
designs were definitely or definitely not THOGs, rather than asking which designs could 
and could not be THOGs. 









GENERAL DISCUSSION 
Seven experiments highlight sources of difficulty that thwart solution to the 
THOG problem. Each experiment represented an attempt to relieve complexity at 
different steps along the solution path. In the standard version of the THOG problem, 
there is no explicit request for hypotheses generation, no attempt to differentiate the 
properties of the positive example from those of the hypothesized properties, no 
suggestion to initially classify the designs given each hypothesis, and no indication in the 
response categories that all designs can be accurately classified as either a THOG or not a 
THOG based exclusively on the information provided. Given these multiple sources of 
confusion, people veer away from the route to the solution at different junctures. Some 
ultimately arrive at the correct answer despite detours by resorting to nonlogical 
strategies. Most do not. 

If hypotheses generation is not requested, a substantial minority of participants act 
in a manner taken to indicate that they are unaware of the necessity of this activity. 
Instead, they may adopt nonlogical tactics, such as comparing each design to the positive 
example, then basing THOGness decisions on perceived similarity to the Black Diamond 
(matching bias). With the standard three-choice instruction, such a strategy is reflected by 
the proportion of Type B intuitive errors, e.g., classifying the White Circle as "not a 
THOG" because it shares no property with the Black Diamond, and classifying the Black 
Circle and White Diamond as indeterminate because they each share only one property 
with the Black Diamond. Overall, across five experimental conditions that included the 
standard THOG problem with a three-choice instruction, 27% of 82 participants 
committed a Type B error. 

Unlike the Type B error pattern, Type A errors have historically been interpreted as 
reflecting an incorrect attempt at hypotheses generation, namely considering black and 
diamond as the properties written down (the common element fallacy). By then correctly 
applying the disjunctive rule, participants who conjecture that black and diamond are 
written down reach the conclusion that the White Circle is not a THOG and the Black 
Circle and White Diamond are THOGs. Overall, across five experimental conditions 
which included the standard THOG with the three-choice instruction, 16% of 82 
participants committed a Type A error. Error patterns for the SARS problem paralleled 
those for the THOG. In the only SARS condition that included the three-choice 
instruction and did not require hypotheses generation, Type B errors (made by 44% of the 
18 participants) were considerably more common than Type A errors (11%). 
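
The logic behind the correct solution and the Type A pattern can be made concrete by enumerating hypotheses. The following Python sketch assumes the standard statement of the THOG rule (a design is a THOG if and only if it has either the written-down color or the written-down shape, but not both) and the four designs used throughout; all function names are illustrative.

```python
from itertools import product

COLORS = ("black", "white")
SHAPES = ("diamond", "circle")
DESIGNS = list(product(COLORS, SHAPES))  # every color and shape combination
POSITIVE_EXAMPLE = ("black", "diamond")  # the design known to be a THOG


def is_thog(design, written):
    """Exclusive disjunction: the design has exactly one written-down property."""
    color, shape = design
    written_color, written_shape = written
    return (color == written_color) != (shape == written_shape)


def consistent_hypotheses():
    """Combinations that could be written down, given the positive example."""
    return [w for w in DESIGNS if is_thog(POSITIVE_EXAMPLE, w)]


def classify(design, hypotheses):
    """Combine the verdicts reached under each hypothesis."""
    verdicts = {is_thog(design, w) for w in hypotheses}
    if verdicts == {True}:
        return "THOG"
    if verdicts == {False}:
        return "not a THOG"
    return "insufficient information"


# Correct reasoning: test both hypotheses consistent with the positive example.
hypotheses = consistent_hypotheses()
correct_pattern = {design: classify(design, hypotheses) for design in DESIGNS}

# Common element fallacy: treat black and diamond as the written-down properties.
type_a_pattern = {design: classify(design, [POSITIVE_EXAMPLE]) for design in DESIGNS}

for design in DESIGNS:
    print(design, "| correct:", correct_pattern[design], "| Type A:", type_a_pattern[design])
```

Under the two consistent hypotheses (black and circle, or white and diamond), every design receives the same verdict, which is why the problem has a determinate answer. Under the common element fallacy, the rule also classifies the Black Diamond itself as not a THOG, which is the contradiction discussed in connection with Experiment 4b.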

To encourage hypotheses generation, versions of the THOG problem have been 
developed that include an initial question asking participants to indicate which color and 
shape combination(s) could be written down. Consistent with prior findings, in five 
experimental conditions in which participants generated hypotheses prior to classifying 
designs, 64% of 97 participants generated the two correct combinations. The proportion 
of participants who did so was lower among those who then responded to the standard 
three-choice instruction (39%) than among those who responded to other instructions 
(two-choice, 61%; "other," 73%; "other" replication, 65%; one-other, 83%). It is possible 
that this range of answers is attributable to random variation; however, an alternative 
explanation may be that participants were not consistently proceeding linearly through the 
problem. As expected, correctly generating hypotheses substantially increased the 
likelihood of correctly classifying designs, but it clearly did not guarantee success. 

An examination of the incorrect answers to hypotheses generation questions 
suggested that the common element fallacy played only a minor role in design 
classification. Across the five THOG conditions mentioned above, only 6% of the 97 
participants reported that only the combination black and diamond could be written 
down, although an additional 10% reported that the combination black and diamond 
could be written down in conjunction with other combinations. The standard three-choice 
condition was the only one in which it was possible to determine the ultimate 
classification of all designs across three answer choices. In that condition, one of five 
people who wrote down black and diamond as an hypothesis then committed a Type A 
error. Thus, evidence for explaining Type A errors in the context of the common element 
fallacy is weak. 

Participants in the SARS conditions also readily generated hypotheses when asked 
to do so. Overall, 79% of 144 participants correctly indicated which two designs could be 
SARS. Across seven experimental conditions, a range of 65% to 100% of participants 
offered the two correct hypotheses. Misconstruing the Black Diamond as an hypothesis 
by identifying it as a SARS was a rare phenomenon. Among participants working on 
SARS problems requiring hypotheses generation, only two of 144 participants made this 
error. As was true in the THOG conditions, generation of correct hypotheses did not 
consistently lead to correct classification of designs. Again, the proportion of correct 
classifications appeared more related to specific classification instructions than to an 
ability to determine hypotheses. 

Assignment of the SARS label to the written-down properties appeared, however, 
to positively influence correct generation of hypotheses compared to the standard request 
to indicate the combination(s) that could be written down. Across all types of instruction, 
the SARS label seemed to result in a somewhat higher proportion of correct hypotheses 
generation compared to versions that did not mention the SARS (three-choice, 89% 
versus 39%; two-choice, 89% versus 61%; high ability "other," 100% versus 73%; 
"other," 72% versus 65%; and one-other, 89% versus 83%). Thus, reducing complexity 
by combining two properties under one label did provide some assistance. However, 
differences in hypotheses generation seldom translated into higher solution rates. 

In contrast, the pattern of results related to instructional manipulations revealed a 
clear influence on solution rates. In the present series of experiments, performance given 
the standard three-choice instruction was compared to performance with only two 
choices, with the "other" THOG instruction and with the one-other THOG instruction. As 
choices were narrowed, performance generally improved. For example, across full-scale 
experiments with typical introductory psychology students on the standard THOG 
problem without hypotheses generation, from a baseline of 11% to 18% correct, the 
two-choice instruction led to a 28% solution rate, the "other" instruction to a 44% 
solution rate, and the one-other instruction to a 37% solution rate. The latter figure may 
be an anomaly, given the Griggs et al. (1998) finding of 53% correct with a similar 
sample versus their 11% baseline. If hypotheses generation was included, performance 
given either the two-choice or "other" options was comparable (39% and 38% 
respectively), but the solution rate with the one-other instruction reached 72%. 

A similar pattern was drawn for the SARS problems, from a baseline of 6% 
correct with a three-choice instruction without hypotheses generation. Comparable figures 
for two-choice and one-other conditions were 22% and 78%, respectively. The SARS 
problems that requested generation of hypotheses also clearly revealed an increase in 
correct solutions as the number of alternatives narrowed: 17% with three choices, 44% 
with two choices, 61% with the "other" instruction, and 83% with the one-other 
instruction. 

The alternative instructions reduce complexity in different ways. The THOG 
problem inherently creates uncertainty regarding which hypothesis is written down. 
Among those participants who do not attempt to test multiple hypotheses or who do not 
realize that answers are the same given either hypothesis, the classification of designs 
may appear dependent on what is written down. Without knowing for sure which specific 
combination is actually written, many participants may opt for the insufficient 
information alternative. In two-choice conditions, removing this option may lead some 
participants toward the realization that there is a definite answer. 

Does the two-choice alternative indirectly reduce uncertainty about which 
hypothesis was written down by implying that it doesn't matter? Or does it more directly 
simplify the problem by narrowing the range of possibilities? If the former is true, then 
participants may more closely examine the possible THOGs and not-THOGs. Among 
those who have actively attempted to determine what is written down, the White Circle 
may appear to be a THOG given either hypothesis. Because there is no indeterminate 
option, the Black Circle and White Diamond may then be classified as not-THOGs. 

This rationale admittedly contains multiple suppositions, as yet unsupported. 
However, the explanation and training experiments (Experiments 4a and 4b) indicate that 
a majority of participants do correctly identify the White Circle as a THOG given either 
hypothesis and tend to make more errors regarding the designs that cannot be THOGs. On 
the other hand, the increased solution rate may also be attributable to improved "odds" for 
guessing. To tease apart alternative explanations might require changing the structure of 
the problem. If participants are asked to consider only one possible combination of 
written down properties, a primary source of uncertainty would be eliminated. Under 
these conditions, if the two-choice instruction remains more effective, its effectiveness 
seems less likely to be attributable to reduction of uncertainty. 

Both the "other" and one-other instructions may achieve their effects by focusing 
attention on a reduced set of possibilities, rather than heightening people's willingness to 
test hypotheses. As O'Brien et al. (1990) mention, the statement that there is only one 
other THOG may encourage participants to choose a design that has a unique relationship 
to the positive example. Because the White Diamond and Black Circle both share one 
feature with the Black Diamond, it may be difficult to justify choosing one instead of the 
other. The White Circle is the only design that does not share any feature with the Black 
Diamond and may be selected as a THOG strictly on the basis of its uniqueness. 

The "other" instruction may operate in a similar manner. For example, many 
participants may reason that because White Diamond and Black Circle are the 
written-down properties, the White Circle is the only remaining candidate for the THOG 
classification. Such a rationale replaces simultaneous evaluation of multiple hypotheses 
with an exclusion strategy. Thus, as Griggs et al. (1998) suggest, the one-other and 
"other" instructions may encourage participants to select the correct answer for the wrong 
reasons. The training conditions provided by the Explanation and Pretest THOG 
manipulations suggest that this may indeed be the case. 

The Explanation THOG provides participants with a detailed rationale for each of 
the hypotheses that could be written down and for the designs that can and cannot be 
THOGs given each hypothesis. Only two pieces of information remained unstated. First, 
the explanation does not explicitly indicate that it is necessary to combine the answers 
from the two hypotheses to achieve a correct solution. Second, the explanation does not 
clarify how to test multiple hypotheses. After reading the problem statement and detailed 
explanation, participants in the O'Brien et al. (1990) experiment performed no better than 
those in their baseline condition. Experiment 4a replicated this counterintuitive null 
result. 

One possible reason for the lack of facilitation relates to the complexity of the 
explanation and the inclusion of potentially misleading information. First, some 
participants may interpret the explanation as containing a contradiction. The explanation 
initially asks participants to consider black and diamond and white and circle among the 
combinations that might be written down but then provides reasons why these 
combinations are not permissible. Second, it mentions that the White Circle is a THOG 
given either combination but fails to mention the Black Diamond. Information about the 
Black Diamond is provided only at the conclusion of the problem statement preceding the 
explanation. Thus, if participants choose to check the explanation when making their 
classifications, they find no explicit statement that the Black Diamond is a THOG. This 
might raise doubts about the status of the Black Diamond, leading to its classification as 
not a THOG or as indeterminate. A majority (70%) of the participants in the O'Brien 
et al. experiment reached this conclusion, but in the replication only one participant (7%) 
did so. The reason for this difference is unclear and may relate to problem 
format. What is clear, however, is that leading participants directly to the point of testing 
multiple hypotheses does not prod them to do so or to consider how to do so correctly. 

Thus, the failure of the Explanation THOG to facilitate performance serves to 
identify the activity of testing multiple hypotheses as a major roadblock to the correct 
solution. If the null effect remains after future experimentation that removes potential 
sources of confusion from the explanation itself, results of the Explanation THOG will 
offer a persuasive rationale for the classification behavior of those who correctly generate 
two hypotheses. Existing data clearly indicate that simultaneously testing multiple 
hypotheses is not a natural activity; in decision-making, for example, people typically opt to attend 
primarily to one hypothesis when making inferences (Mynatt, Doherty, & Dragan, 1993). 
In the THOG problem, faced with uncertainty concerning which of the two hypotheses is 
written down, participants may abandon any attempt to perform hypotheses tests. Among 
those who do make such an attempt, lack of procedural knowledge may block the correct 
solution (or may lead to correct answers for the wrong reasons). 

Pretest THOG data support this thesis. The Pretest conditions were similar to the 
Explanation THOG because participants were told what combinations of properties could 
be written down and were given a rationale for each. Following the O'Brien et al. (1990) 
procedure, pretest questions in Experiment 4a were worded to encourage participants to 
classify the Black Diamond, as well as the other three designs. However, unlike the 
Explanation THOG, the Pretest versions required participants to make their own 
determinations of which designs could or could not be THOGs given each of the 
hypotheses. If questions regarding design classification given each hypothesis were 
presented as a single group, performance was comparable to baseline, obscuring the small 
effect in the O'Brien et al. study. However, a change in format that allowed individual 
inspection of answers to each of the four questions appeared to facilitate performance, 
although the level of facilitation did not reach significance using the standard .05 
criterion. This may reflect in part the small sample sizes (N = 15 per group). If each 
original group had included 30 participants, a similar pattern of results would have been 
interpreted as revealing statistically significant facilitation for the Split-Questions Pretest 
THOG compared to the standard THOG baseline, Explanation, and Grouped-Questions 
Pretest. A replication of the Split-Questions Pretest with an additional 15 participants 
provided possible evidence for some facilitation (60%), but to a lesser extent than in the 
original experiment (73%). 

In the Pretest THOG (split-questions) format, qualifying the pretest questions 
concerning which designs could be THOGs by inserting the phrase "other than the Black 
Diamond" failed to produce facilitation and may have added to, rather than relieved, 
potential confusion. Fewer participants correctly answered pretest questions when the 
qualifier was present than when it was absent. Additionally, asking participants to make a 
definite commitment in the pretest questions regarding which designs could and could not 
be THOGs seemed to reduce the proportion of completely correct answers to these 
questions. Instead, participants tended to produce only one of the two possible designs 
that could not be THOGs for each hypothesis. Responses were about equally split 
between those indicating that the combinations written down could not be THOGs and 
those indicating that the combinations not written down could not be THOGs. Thus, 
insertion of the word "definitely" may have uncovered a specific stumbling block to 
solution, even among participants who generated hypotheses. This might help explain the 
indeterminate status of the White Diamond and Black Circle in the most prevalent type of 
intuitive error. 

Across the seven Pretest conditions (including the replication of the Pretest 
THOG [split-questions]), 39% of 105 participants answered all four pretest questions 
correctly, 44% correctly identified the White Circle as a THOG but provided only one 
correct answer to each question regarding which design or designs could not be a THOG, 
and 19% made some other error. Among those who answered all four pretest questions 
correctly, 79% subsequently classified the designs accurately. Additionally, in the group 
that correctly identified the White Circle but gave only a partial answer to the 
"not-THOG" pretest questions, 41% correctly classified the designs. This percentage 
suggests that a minority block of participants attained the correct answer via a path that 
circumvents appropriate logical inferences. The failure of this group to exhibit uncertainty 
by choosing the insufficient information category for the White Diamond and Black 
Circle suggests that they combined design classifications for the two hypotheses in a 
unique manner. Perhaps people reasoned that if the White Diamond could not be a THOG 
given one hypothesis and the Black Circle could not be a THOG given the other, then 
both of these designs were definitely not THOGs. This implies that these participants 
neglected to consider that only one hypothesis could actually be written down. 



More direct evidence for the inability of some participants to perform the required 
multiple hypotheses tests comes from the responses of those participants who offered 
completely correct answers to all pretest questions but did not accurately classify the 
designs (nine participants, representing 21% of those who correctly answered all pretest 
questions). Error patterns were varied. Two participants indicated that the status of all 
designs was indeterminate. No other response pattern was offered by more than one 
individual. 

Others reached inaccurate conclusions by applying logic to incorrect assumptions. 
In the group that correctly identified the White Circle as a THOG but gave only a partial 
answer to the "not-THOG" pretest questions, 24% proceeded logically (based on their 
responses to pretest questions) to classify the White Circle and Black Diamond as 
THOGs and the White Diamond and Black Circle as indeterminate. The extent of this 
"near insight" illustrates that an incorrect answer does not necessarily indicate the 
complete absence of logical reasoning. On the other hand, it is difficult to reconstruct the 
basis for the intuitive errors committed by 17% of this group. After indicating that the 
White Circle was a THOG given either hypothesis, they then concluded that it was 
definitely not a THOG. Perhaps the confusion resulting from uncertainty regarding the 
classification of the Black Circle and White Diamond led some to abandon logical 
reasoning and resort to a matching strategy at this final stage of problem-solving. 
Supporting this possibility, an additional 9% of this group of participants offered a 
nonintuitive error pattern indicating that the White Circle was not a THOG or that there 
was insufficient information to decide. 
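
The contrast between these response patterns can be made concrete with a small sketch. It is offered only as an illustration under stated assumptions: the dictionary keys, the function names `pooled` and `strict`, and the encoding of a "partial" pretest answer as a single excluded design per hypothesis are hypothetical and are not taken from the experimental materials. Pooling the exclusions from both hypotheses reproduces the correct classification from incomplete premises, whereas requiring agreement across hypotheses leaves the White Diamond and Black Circle indeterminate, as in the "near insight" pattern.

```python
# Hypothetical encoding of answers to the "not-THOG" pretest questions.
# A partial answer names only ONE excluded design per hypothesis (here, the
# combination assumed to be written down); a complete answer names both.
partial_not_thog = {
    "white diamond written down": {"white diamond"},
    "black circle written down": {"black circle"},
}
complete_not_thog = {
    "white diamond written down": {"white diamond", "black circle"},
    "black circle written down": {"white diamond", "black circle"},
}


def pooled(answers):
    """Shortcut: exclude any design excluded under at least one hypothesis."""
    return set().union(*answers.values())


def strict(answers):
    """Valid rule: exclude a design only if every hypothesis excludes it."""
    return set.intersection(*answers.values())


# Pooling partial answers happens to match the correct exclusions.
pooled_from_partial = pooled(partial_not_thog)
# Strict combination of partial answers excludes nothing, so both designs
# would be left as "insufficient information to decide".
strict_from_partial = strict(partial_not_thog)
# Strict combination of complete answers yields the correct exclusions.
strict_from_complete = strict(complete_not_thog)
```

Under this encoding, the pooled rule and the strict rule agree only when the per-hypothesis answers are complete, which is one way to read the difference between the participants who classified correctly and those who stopped at indeterminate.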






The classification of the White Circle as indeterminate may echo abandonment of 
the sure-thing principle in decision-making in uncertain situations (Shafir & Tversky, 
1992). The sure-thing principle suggests that "if we prefer x to y given any possible state 
of the world, then we should prefer x to y even when the exact state of the world is not 
known" (Shafir & Tversky, 1992, p. 450). Similarly, if the White Circle is a THOG if 
White Diamond is written down and the White Circle is a THOG if Black Circle is 
written down, then the White Circle will be a THOG even if participants do not know 
which of the two combinations is written down. In decision-making, however, when the 
exact state of the world is unknown, people do not consistently act in accordance with the 
sure-thing principle, instead opting to maintain a "wait and see" position by postponing 
the decision. This example of nonconsequentialist reasoning may be analogous to 
selection of the "insufficient information to decide" option in the THOG task. 

Thus, the complexity of the THOG problem in its abstract form renders it resistant 
to facilitation at multiple junctures. This set of experiments clarifies how people err at 
different stages of the problem: generating hypotheses, classifying designs given each 
hypothesis, and combining the results of these initial classifications to reach a logical 
conclusion. Results clearly demonstrate that answers alone provide a misleading picture 
of underlying logical processes. Some people may logically reach an incorrect answer if 
they commit a single error in any of the preceding stages. Others may provide a correct 
answer based on tactics that are inherently nonlogical. 
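
The three stages named above can be sketched for the standard abstract problem, in which the Black Diamond is the known THOG. This is a minimal illustration rather than a model of participants' reasoning; the function and variable names are hypothetical, and the combination rule (a verdict is definite only when every admissible hypothesis agrees) is the logical rule the task requires.

```python
from itertools import product

COLORS = ("black", "white")
SHAPES = ("diamond", "circle")
DESIGNS = list(product(COLORS, SHAPES))
POSITIVE_EXAMPLE = ("black", "diamond")  # the design known to be a THOG


def is_thog(design, hypothesis):
    """Exclusive disjunction: the design shares the written-down color or
    the written-down shape, but not both."""
    same_color = design[0] == hypothesis[0]
    same_shape = design[1] == hypothesis[1]
    return same_color != same_shape


def admissible_hypotheses():
    """Stage 1: combinations that could be written down, i.e. those under
    which the positive example is a THOG."""
    return [h for h in DESIGNS if is_thog(POSITIVE_EXAMPLE, h)]


def classify_given(hypothesis):
    """Stage 2: classify every design under a single hypothesis."""
    return {d: is_thog(d, hypothesis) for d in DESIGNS}


def combine(per_hypothesis):
    """Stage 3: a verdict is definite only if all hypotheses agree."""
    verdicts = {}
    for d in DESIGNS:
        outcomes = {table[d] for table in per_hypothesis}
        if outcomes == {True}:
            verdicts[d] = "THOG"
        elif outcomes == {False}:
            verdicts[d] = "not a THOG"
        else:
            verdicts[d] = "insufficient information"
    return verdicts


hypotheses = admissible_hypotheses()
solution = combine([classify_given(h) for h in hypotheses])
```

An error at any one stage propagates: dropping a hypothesis in stage 1 or misclassifying a design in stage 2 changes what stage 3 receives, so an incorrect final answer can still follow from otherwise valid reasoning.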






CONCLUSIONS AND FUTURE DIRECTIONS 
Within the confines of the problem structure of the standard abstract THOG 
problem, consistent facilitation of logical reasoning is ultimately hampered by difficulties 
related to testing multiple hypotheses. Understanding of the exclusive disjunctive rule is 
not a major roadblock. Hypotheses generation can typically be encouraged by explicit 
requests, and a majority of people generally succeed at this task. Separating the properties 
of the positive example from those of the hypotheses by providing a single label is helpful 
at this point. However, hypotheses generation alone does not reliably facilitate solution. 
Given each hypothesis, most people are capable of determining which designs are 
THOGs and, to a somewhat lesser extent, which designs are not THOGs, for both 
possibilities. Even at this stage, however, correct classification of the designs remains 
elusive. Perhaps stemming from the uncertainty surrounding which hypothesis is actually 
written down, people either fail to consider how to combine information from the two 
hypotheses or do so incorrectly. Resistance to simultaneous evaluation of two hypotheses 
appears to be strong. 

Although some instruction wordings result in higher solution rates, evidence 
suggests that the facilitation may stem from nonlogical factors. It appears that people 
resort to heuristic tactics, even after engaging in some form of logical reasoning that 
leads them to a point of confusion. In this case, heuristics such as matching or a strategy 
of elimination are not necessarily preattentive. Thus, they should not be confused with the 
conception of a preattentive heuristic stage that guides or misguides initial selection of 
relevant information (Evans, 1984). Instead, if participants are unfamiliar with procedures 
for logical classification, as most are with respect to multiple hypotheses testing, they 
respond to linguistic cues and approach the classification task from the suggested 
perspective. Without specific training in multiple hypotheses testing, it seems doubtful 
that this difficulty can be overcome. 

Thus, given the requirement for multiple hypotheses testing in the THOG 
problem, its complexity may become an insurmountable challenge. As a result, people 
may resort to the more primitive strategy of matching to the positive example. Testing 
this possibility may involve sacrificing the structural integrity of the THOG by creating 
versions that ask for only one hypothesis and then request classification of designs. If 
performance is reliably stronger and the proportion of Type B errors is reliably smaller in 
single-hypothesis versus dual-hypotheses conditions, findings would provide further 
evidence that the presence of multiple hypotheses enhances uncertainty and presents a 
major roadblock to solution. 

Additionally, because the rationale offered by the Explanation THOG seemed 
overly complex and in effect battled complexity with additional complexity, it appears 
appropriate to test a modified explanation that brings people up to the point of combining 
information from the two hypotheses. If the revised version remains ineffective, then 
identification of multiple-hypotheses testing as the major contributor to confusion will be 
further clarified. Follow-up experiments might then determine the key ingredients needed 
to encourage people to simultaneously evaluate both hypotheses. 









To explore whether facilitation from the "other" and one-other instructions stems 
from logical reasoning, an experiment designed to uncover transfer effects may be 
appropriate. Prior research (Griggs & Newstead, 1983; Smyth & Clark, 1986) has been 
based on the rationale that a manipulation evoking logical reasoning will transfer to 
another problem, if the two problems are similar in structure, whereas a manipulation 
based on task-specific heuristics will not. For example, Griggs and Newstead (1983) 
considered lack of transfer from the not-THOG problem to the standard THOG as 
evidence for use of nonlogical strategies on the not-THOG task. Smyth and Clark also 
found no transfer between several versions of their Half-Sister problem and the standard 
THOG. Transfer effects, although notoriously difficult to attain (but see Griggs & 
Newstead, 1982, Exp. 1, for evidence of transfer between two realistic problems), from 
problems with modified instructions to problems with the standard three-choice 
alternative would refute the hypothesis that benefits from changes in wording are linked 
to nonlogical strategies. Another method for examining the logic question would be to 
specifically request justifications. However, if people approach the problem in accord 
with heuristic/analytic theory (Evans, 1995), then their verbal reports would fail to 
capture preattentive processes that may have led them astray. Moreover, retrospective 
reports are not necessarily a reliable indication of completed thought processes. 

By approaching the issue from both the training and transfer perspectives, 
subsequent renditions of the THOG problem would further illuminate (and perhaps 
ameliorate) specific difficulties people experience as they attempt the task. The challenge 
is to develop a version of the problem that both promotes facilitation and results in 
transfer to the standard abstract THOG. As of now, the cognitive complexity of this 
standard problem has provided a formidable barrier to the success of such attempts. 






REFERENCES 



Braine, M. D., & Rumain, B. (1981). Development of comprehension of "or": 
Evidence for a series of competencies. Journal of Experimental Child Psychology, 31, 
46-70. 

Bruner, J. S., Goodnow, J. J., & Austin, G. A. (1956). A study of thinking. New 
York: Wiley. 

Cordell, R. L. (1978). Mature reasoning and problem solving. Unpublished 
University of Nottingham M. Ed. dissertation. 

Doherty, M. E., Mynatt, C. R., Tweney, R. D., & Schiavo, M. D. (1979). 
Pseudodiagnosticity. Acta Psychologica, 43, 111-121. 

Evans, J. St. B. T. (1972). Interpretation and matching bias in a reasoning task. 
Quarterly Journal of Experimental Psychology, 24, 193-199. 

Evans, J. St. B. T. (1984). Heuristic and analytic processes in reasoning. British 
Journal of Psychology, 75, 451-468. 

Evans, J. St. B. T. (1989). Bias in human reasoning: Causes and consequences. 
Hove, UK: Erlbaum. 

Evans, J. St. B. T. (1995). Relevance and reasoning. In S. E. Newstead & 
J. St. B. T. Evans (Eds.), Perspectives on thinking and reasoning (pp. 147-171). Hillsdale, 
NJ: Erlbaum. 

Evans, J. St. B. T., Ball, L. J., & Brooks, P. G. (1987). Attentional bias and 
decision order in a reasoning task. British Journal of Psychology, 78, 385-394. 

Evans, J. St. B. T., & Lynch, J. S. (1973). Matching bias in the selection task. 
British Journal of Psychology, 64, 391-397. 

Evans, J. St. B. T., & Newstead, S. E. (1980). A study of disjunctive reasoning. 
Psychological Research, 41, 373-388. 

Evans, J. St. B. T., & Newstead, S. E. (1995). Creating a psychology of reasoning. 
In S. E. Newstead & J. St. B. T. Evans (Eds.), Perspectives on thinking and reasoning (pp. 
1-16). Hillsdale, NJ: Erlbaum. 






Evans, J. St. B. T., Newstead, S. E., & Byrne, R. M. J. (1995). Human reasoning: 
The psychology of deduction. Hove, UK: Erlbaum. 

Gazdar, G. (1979). Pragmatics: Implications, presuppositions and logical form. 
New York: Academic Press. 

Girotto, V., & Legrenzi, P. (1989). Mental representation and 
hypothetico-deductive reasoning: The case of the THOG problem. Psychological 
Research, 51, 129-135. 

Girotto, V., & Legrenzi, P. (1993). Naming the parents of the THOG: Mental 
representation and reasoning. Quarterly Journal of Experimental Psychology, 46A, 
701-713. 

Griggs, R. A., & Cox, J. R. (1982). The elusive thematic-materials effect in 
Wason's selection task. British Journal of Psychology, 73, 404-420. 

Griggs, R. A., & Newstead, S. E. (1982). The role of problem structure in a 
deductive reasoning task. Journal of Experimental Psychology: Learning, Memory, and 
Cognition, 4, 297-307. 

Griggs, R. A., & Newstead, S. E. (1983). The source of intuitive errors in Wason's 
THOG problem. British Journal of Psychology, 74, 451-459. 

Griggs, R. A., Platt, R. D., Newstead, S. E., & Jackson, S. L. (1998). Attentional 
factors in a disjunctive reasoning task. Thinking and Reasoning, 4, 1-14. 

Hurford, J. R. (1974). Exclusive or inclusive disjunction. Foundations of 
Language, 11, 409-411. 

Inhelder, B., & Piaget, J. (1958). The growth of logical thinking from childhood to 
adolescence. New York: Basic Books. 

Johnson-Laird, P. N., Legrenzi, P., & Legrenzi, M. (1972). Reasoning and a sense 
of reality. British Journal of Psychology, 63, 395-400. 

Lakoff, R. (1971). If's, and's, and but's about conjunction. In C. J. Fillmore & D. 
T. Langendoen (Eds.), Studies in linguistic semantics (pp. 114-149). New York: Holt, 
Rinehart & Winston. 

Luger, G. F., & Bauer, M. A. (1978). Transfer effects in isomorphic problem 
situations. Acta Psychologica, 42, 121-131. 

Mynatt, C. R., Doherty, M. E., & Dragan, W. (1993). Information relevance, 
working memory, and the consideration of alternatives. Quarterly Journal of 
Experimental Psychology, 46A, 759-778. 









Newstead, S. E., Girotto, V., & Legrenzi, P. (1995). The THOG problem and its 
implications for human reasoning. In S. E. Newstead & J. St. B. T. Evans (Eds.), 
Perspectives on thinking and reasoning (pp. 261-285). Hillsdale, NJ: Erlbaum. 

Newstead, S. E., & Griggs, R. A. (1983). The language and thought of 
disjunction. In J. St. B. T. Evans (Ed.), Thinking and reasoning: Psychological 
approaches (pp. 76-106). London: Routledge & Kegan Paul. 

Newstead, S. E., & Griggs, R. A. (1992). Thinking about THOG: Sources of error 
in a deductive reasoning problem. Psychological Research, 54, 299-305. 

Newstead, S. E., Griggs, R. A., & Chrostowski, J. J. (1984). Reasoning with 
realistic disjunctives. Quarterly Journal of Experimental Psychology, 36A, 611-627. 

Newstead, S. E., Griggs, R. A., & Warner, S. A. (1982). The effects of realism on 
Wason's THOG problem. Psychological Research, 44, 85-96. 

O'Brien, D. P., Noveck, I. A., Davidson, G. M., Fisch, S. M., Lea, R. B., & 
Freitag, J. (1990). Sources of difficulty in deductive reasoning: The THOG task. 
Quarterly Journal of Experimental Psychology, 42A, 329-351. 

Platt, R. D., & Griggs, R. A. (1993). Facilitation in the abstract selection task: The 
effects of attentional and instructional factors. Quarterly Journal of Experimental 
Psychology, 46A, 591-613. 

Reed, S. K., Ernst, G. S., & Banerji, R. B. (1974). The role of analogy in transfer 
between similar problem states. Cognitive Psychology, 6, 436-450. 

Roberge, J. J. (1977). Effects of content on inclusive disjunction reasoning. 
Quarterly Journal of Experimental Psychology, 29, 669-676. 

Roberge, J. J. (1978). Linguistic and psychometric factors in propositional 
reasoning. Quarterly Journal of Experimental Psychology, 30, 705-716. 

Shafir, E., & Tversky, A. (1992). Thinking through uncertainty: Nonconsequential 
reasoning and choice. Cognitive Psychology, 24, 449-474. 

Smyth, M. M., & Clark, S. E. (1986). My half-sister is a THOG: Strategic 
processes in a reasoning task. British Journal of Psychology, 77, 275-287. 

Tweney, R. D., Doherty, M. E., Worner, W. J., Pliske, D. B., Mynatt, C. R., 
Gross, K. A., & Arkkelin, D. L. (1980). Strategies of rule discovery in an inference task. 
Quarterly Journal of Experimental Psychology, 32, 109-123. 

Tversky, A., & Shafir, E. (1992). The disjunction effect in choice under 
uncertainty. Psychological Science, 3, 305-309. 






Wason, P. C. (1977). Self-contradictions. In P. N. Johnson-Laird & P. C. Wason 
(Eds.), Readings in cognitive science (pp. 114-128). Cambridge: Cambridge University 
Press. 

Wason, P. C. (1978). Hypothesis testing and reasoning. Unit 25, Block 4, 
Cognitive psychology. Milton Keynes: Open University Press. 

Wason, P. C. (1979). Novelty and tradition in cognitive research: A case study. 
Italian Journal of Psychology, 6, 1-7. 

Wason, P. C. (1981). Understanding the limits of formal thinking. In H. Parret & 
J. Bouveresse (Eds.), Meaning and understanding (pp. 411-421). Berlin: Walter de 
Gruyter. 

Wason, P. C., & Brooks, P. G. (1979). THOG: The anatomy of a problem. 
Psychological Research, 41, 79-90. 

Yachanin, S. A., & Tweney, R. D. (1982). The effect of thematic content on 
cognitive strategies in the four-card selection task. Bulletin of the Psychonomic Society, 
19, 87-90. 



APPENDIX A: WORDING OF PROBLEMS 

Problems are grouped in alphabetical order by the surname(s) of the author(s) who 
wrote the publication in which the problems appear. If more than one publication has 
been authored by the same individual(s), then problems from the earliest publication are 
listed first. If there are multiple problems described in a single publication, problems are 
listed in order of the experiments in which the problems appear. Different versions of the 
same problem are listed together. 

Girotto and Legrenzi (1989) 

Two-level spy (Exp. 2) 

"Four Soviet spies are working in London under cover of false jobs and visas. On their 
passports there are the following pairs of official features: 

            Type of Job    Type of Visa 
(1) Yurj    Scientist      Tourist 
(2) Ivan    Scientist      Work 
(3) Boris   Reporter       Tourist 
(4) Anton   Reporter       Work 

These official features, in emergency cases, can be modified. The job of scientist can be 
changed to that of reporter (and vice versa), and the tourist visa can be changed to a work 
visa (and vice versa). However, in each passport it is possible to modify only one feature 
(either the type of job or the type of visa, but not both), since, if both features are altered, 
the passport can be detected as false. Moscow is worried about the possibility that the 
London network has been detected. In an emergency, a radio message is sent from 
Moscow, but the reception in London is bad: 'Danger: Come back! In order to get 
through passport control you must have on the passport either the job. . . , or the visa . . . , 
but not both.' Since the spies are not able to decode the message entirely, and are 
frightened, they modify one and one feature only of the passport and rush to the airport. 
Now, you know that: (a) all four spies altered only one feature of their passport; (b) Yurj 
arrived safely in Moscow. Do you think that someone else has been able to escape the 
British secret service? If so, who?" 

One-level spy (Exp. 2) 

The wording of this problem was identical to the wording of the Two-Level Spy problem 
up to the diagram. It then continued: 

"Moscow has chosen one type of job and one type of visa. To return to Moscow, the spies 
have been given the following rule: 'if, and only if, a spy has either the specified job or 
the specified visa, but not both, then the spy can come back to Moscow.' Given that you 
know that Yuri arrived safely in Moscow, do you think that someone else arrived in 
Moscow? If so, who?" 

Pub (Exp. 3) 

"Five friends meet every night in the pub. One night, Charles decides to play a game. 'I 
have brought a deck of cards. It contains only these four types of cards (four designs are 
presented horizontally). 

I deal one for myself from the deck, and I won't show it to you. Now, I'll deal each of you 
a card, and I'll pay for a dinner for each person who has a card including either the color 
of my card, or the shape of my card, but not both.' 

The following are the cards of Charles's friends (four designs with names underneath are 
presented in a different order than the first time that types of cards were shown): 

Charles continues: 'Without showing you my card, I can tell you that I owe Rob a dinner. 
Which card do you think I could have? And do you think that I have to pay for a dinner 
for someone else? If so, for whom?'" 

Girotto and Legrenzi (1993) 

Hypotheses-THOG (Exp. 1) 

After reading the standard THOG problem statement, participants were requested to 
identify the hypothetical features written down by the experimenter: 

"Knowing for sure that the black diamond is a THOG, you have to indicate which 
combination of shape and color I could have written down." 

Afterwards, they were requested to solve the standard THOG problem, with the following 
revised instruction: 












"Could you also indicate whether, in addition to the black diamond, there are other 
THOGs?" 

SARS (Exp. 1; also in Griggs et al., 1998, Exp. 3) 

"In front of you are four designs: black diamond, white diamond, black circle, and white 
circle (four designs are shown). I have defined one of these designs as a SARS. You do 
not know which design this is. But you do know that a design is a THOG if it has either 
the color or the shape of the SARS, but not both. Knowing for sure that the black 
diamond is a THOG, you have to indicate which one or which ones, among the remaining 
designs, could be the SARS. 

Could you also indicate whether, in addition to the black diamond, there are other 
THOGs?" 

SARS 6 -Id Color and Shape (Exp. 3) 

"In front of you are six designs composed of four different shapes (circle, diamond, 
rectangle, and triangle) and of three different colors (black, white, and gray). (Six designs 
are shown.) 

I have chosen one shape and one color. This pair of properties, unknown by you, is called 
SARS. You do know that a design is called THOG if it includes either the shape of the 
SARS or the color of the SARS, but not both. Knowing for sure that the black diamond is 
a THOG, you have to indicate which combination of color and shape I could have chosen 
- that is, you have to find the possible SARS. 

Could you also indicate whether, in addition to the black diamond, there are other 
THOGs?" 

Griggs and Newstead (1982) 

Diet (Exp. 1) 

"Four ladies, Joan, Mary, Helen, and Susan, joined a dieting class. The diet that they were 
to follow was one which controlled strictly the amount of meat they should eat to two 
ounces a day, all of which had to be consumed at one meal. The instructor told them, 
'You must have the two ounces of meat either for lunch or for dinner, but you must not 
have meat for both lunch and dinner.' 

Shortly afterwards, the ladies decided to go out for a day's picnic. They took some 
sandwiches with them, and so that they would stick to their diet, they took two types of 
sandwiches for lunch. They put the sandwiches in boxes labeled 1 and 2. One of these 
boxes contained roast beef sandwiches, the other cheese sandwiches. For dinner they 
packed two more boxes, 3 and 4. One of these boxes contained ham sandwiches, the other 
cucumber sandwiches. 






After their picnic outing, the ladies were discussing what sandwiches they had eaten. 
They found that they had all taken their sandwiches from a different combination of 
boxes. These were the combinations they had eaten: 





         Lunch Sandwich    Dinner Sandwich 
Joan     From Box 1        From Box 3 
Mary     From Box 1        From Box 4 
Helen    From Box 2        From Box 3 
Susan    From Box 2        From Box 4 



Joan had eaten a combination of sandwiches that satisfied the instructions given by their 
dieting instructor." 

The participants' task was to classify the sandwich combinations eaten by the other 
ladies. 

Structured Diet (Exp. 3, 4) 

This problem was similar to the Diet problem, except that the dieting ladies were told to 
have meat for one and only one meal, "and to have cheese for one and only one meal." 

Drug (Exp. 1, 2) 

"Dr. Robinson was instructing some trainee nurses on how to administer drugs. He was 
talking about kidney diseases and told the nurses that renal patients required carefully 
controlled intakes of calcium and potassium. The best way of administering these was by 
two injections daily, but patients became very sore with this number of injections. Thus, it 
was hospital policy to administer one drug intravenously and one orally. The doctor 
emphasized, 'You must give the patients potassium either in an injection or orally every 
day, but of course you must not give them both the potassium injection and the potassium 
pill. Similarly, you must give the patients calcium but not both the calcium injection and 
the calcium pill.' 

The nurses were then told, as a class exercise, to decide what brand name of drugs they 
would select to administer to patients. They were told to choose some combination of the 
drugs Deroxin and Altanin (which are intravenous drugs, one containing calcium and the 
other potassium) with the drugs Prisone and Triblomate (which are orally administered 
drugs, one of which contains calcium, the other potassium). 









At the next class, Dr. Robinson was surprised to find that the class had produced as 
answers all the possible combinations of the drugs: 





             Injection    Drug 
Answer 1:    Deroxin      Prisone 
Answer 2:    Deroxin      Triblomate 
Answer 3:    Altanin      Prisone 
Answer 4:    Altanin      Triblomate 



Dr. Robinson got as far as telling the class that the combination in Answer 1 conformed 
to his instructions when he was called away to do an emergency operation. Hence, the 
student nurses had to work out for themselves whether they were right or wrong." 

Participants then had to classify the remaining combinations as to whether or not they 
conformed to Dr. Robinson's instructions or whether there was insufficient information 
to decide. 

Rephrased Drug (Exp. 2) 

The Rephrased Drug problem incorporated several minor changes to the Drug problem 
to make respective links between drugs and elements more unlikely. The primary change 
was the addition of a statement that said explicitly that the nurses did not know which 
drug contained which element. 

Newstead, Griggs, and Warner (1982) 

Style (Exp. 1) 

"At the bottom of the page are descriptions of four women: Suzy, Jane, Amy, and Mary. 
Their preferences for a particular kind of clothing and a particular kind of music are 
given. 

You are to assume that I have written down one of the types of clothing (jeans and shirts, 
or dresses), and one of the types of music (rock or classical). Now read the following rule 
carefully: 

If and only if a woman's description includes either the type of clothing I have written 
down, or the type of music I have written down, but not both, then she is said to have 
style. 









I now tell you that Suzy (who likes rock music and wears dresses) has style. This does not 
mean that rock music and dresses are the things I have written down. It only means that 
Suzy has style according to the above rule and the types of music and clothing I have 
written down." 

The participants' task was to classify the other three women as to whether they did or did
not have style, or whether there was insufficient information to decide.

Psychology (Exp. 2) 

The problem involved determining the eligibility of psychology students to enroll in a
particular third-year psychology course (Course P), given the following restriction:

"Because it is a cognitive psychology course, students are required to have taken one 
previous course in the area. However, in order to avoid overspecialization, students who 
have taken two cognitive psychology courses are excluded from Course P." 

Participants were told that each of four students had taken a cognitive or social
psychology course each year and which of two first-year and two second-year courses
the four students had taken, but were not told which courses were social or cognitive. They
were also told that one student was eligible to take Course P.

The participants' task was to decide on the eligibility of the other three students. 

Meat and Gravy (Exp. 3) 

Version 1. "A friend of mine has written down two items of food: a solid (either 
meat or ice cream) and a sauce (either gravy or chocolate sauce). If I were offered a meal 
which contains just one of the items he has written down, then I would eat it. However, if 
I were offered a meal containing both the items he has written down, I would definitely 
not eat it. One thing is for sure: I would eat meat and gravy." 

Version 2. "A friend of mine has written down two items of food: a solid (either 
meat or ice cream) and a sauce (either gravy or chocolate sauce). If I were offered a meal 
which contains either of the items he has written down, then I would eat it. However, if I 
were offered a meal containing both the items he has written down or neither of the items 
he has written down, I would not eat it. One thing is for sure: I would eat meat and 
gravy." 












Smyth and Clark (1986) 

Half-Sister (Exp. 1)

"At the bottom of this page are the names of four women and their parents. My father and 
George are two different men, and my mother and Jane are two different women. If Robin 
is my half-sister which of the others is also my half-sister? Please classify each of the 
other three women according to this classification: A. Definitely my half-sister. B. 
Definitely not my half-sister. C. There is insufficient evidence on which to base a 
decision." 

The following names and pairs of parents were listed: 

Robin, my father and Jane 
Val, George and Jane 
Kate, George and my mother 
Jo, my father and my mother 

Cued Half-Sister (Exp. 2) 

"At the bottom of this page are descriptions of the parents of four women. You are to 
assume that I have written down one of the mothers (Jane or my mother) and one of the 
fathers (my father or George). Now read the following rule carefully: If, and only if, the 
description of the woman's parents includes either the mother I have written down, or the 
father I have written down, but not both, then that woman is my half-sister. I now tell you 
that Robin is my half-sister." 

Classification instructions, the names of the women, and the pairs of parents were the
same as in the Half-Sister problem.

Uncued Half-Sister (Exp. 3, 5) 

"At the bottom of this page are the names of four women and their parents. One of the 
fathers (Tom or George) is my father and one of the mothers (Jane or Mary) is my 
mother. If, and only if, a woman's parents include either my mother, or my father, but not 
both, then that woman is my half-sister." 

Classification instructions were the same as in the Half-Sister problem. The names of the
women and the pairs of parents were as follows:

Robin, George and Jane 
Val, Tom and Jane 
Kate, Tom and my mother 
Jo, George and my mother 







O'Brien et al. (1990)

Explanation THOG (Exp. 2) 

The problem statement and figure showing designs are the same as in the standard THOG
problem. Before the instruction, participants read the following explanation:

"Consider all of the combinations that I might have written down: Triangle and Black, 
Circle and Black, Triangle and White, and Circle and White. I could not have written 
down both Triangle and Black because the Black Triangle includes both of these features. 
Similarly, I could not have written down both Circle and White because the Black 
Triangle contains neither of these features. But the remaining combinations each include 
one of the features of a Black Triangle, and hence I could have written down either Circle 
and Black, or Triangle and White. 

Consider first the possibility that I wrote down Circle and Black. In this case, a black 
circle cannot be a THOG because it includes both features, and a white triangle cannot be 
a THOG because it includes neither feature. But a white circle is a THOG because it 
includes one of the features and not the other. Consider next the possibility that I wrote 
down Triangle and White. In this case, a black circle cannot be a THOG because it 
includes neither of the features, and a white triangle cannot be a THOG because it 
includes both features. Again, a white circle is a THOG because it includes one of the two 
features and not the other." 

Participants are then requested to classify the designs. 
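
The case analysis in the explanation above can be checked mechanically. The following is a minimal Python sketch and is not part of the original materials; the names `is_thog` and `classify` are hypothetical and chosen only for illustration. It enumerates every color and shape pair that could have been written down, keeps the pairs consistent with the Black Triangle being a THOG, and then classifies each design under the remaining hypotheses.

```python
from itertools import product

COLORS = ("Black", "White")
SHAPES = ("Triangle", "Circle")


def is_thog(design, written):
    """Exclusive-or rule: exactly one written-down feature matches the design."""
    (color, shape), (w_color, w_shape) = design, written
    return (color == w_color) != (shape == w_shape)


def classify(known_thog):
    """Return the consistent hypotheses and a label for each of the four designs."""
    designs = list(product(COLORS, SHAPES))
    # A written-down pair is a live hypothesis only if it makes the known design a THOG.
    hypotheses = [w for w in designs if is_thog(known_thog, w)]
    labels = {}
    for d in designs:
        verdicts = {is_thog(d, w) for w in hypotheses}
        if verdicts == {True}:
            labels[d] = "definitely a THOG"
        elif verdicts == {False}:
            labels[d] = "definitely not a THOG"
        else:
            labels[d] = "insufficient information"
    return hypotheses, labels


hypotheses, labels = classify(("Black", "Triangle"))
for design, label in labels.items():
    print(design, "->", label)
```

Under these assumptions the two surviving hypotheses are Circle and Black, and Triangle and White, and the White Circle receives the same verdict under both, which is the point the explanation makes in prose.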

Pretest THOG (Exp. 2) 

After the standard THOG problem statement and figure showing designs, participants
read the following explanation about combinations of properties that could be written
down:

"Because you know that a THOG has either the shape I have written down or the colour I 
have written down, but not both, it follows that either the colour I have written down is 
black or the shape I have written down is triangle, but not both. Further, because there are 
only two shapes, it follows that if the colour I have written down is black, then the shape I 
have written down is circle; and because there are only two colours, it follows that if the 
shape I have written down is triangle, the colour I have written down is white. There are, 
therefore, two possible combinations that I could have written down: Triangle and White, 
or Circle and Black." 

Participants are then requested to respond to the following four questions: 









"First, suppose I wrote down Triangle and White. In this case, which of the designs can 
be a THOG? And in this case, which of the designs cannot be a THOG? Second, suppose 
that I wrote down Circle and Black. In this case, which of the designs can be a THOG? 
And in this case, which of the designs cannot be a THOG?" 

Participants are then asked to classify the designs. 

Wason and Brooks (1979) 

Standard THOG (Exp. 1, 2)

"In front of you are four designs: Blue Diamond, Red Diamond, Blue Circle and 
Red Circle. (Designs are shown.) You are to assume that I have written down one of the 
colors (blue or red) and one of the shapes (diamond or circle). Now read the following 
rule carefully: If, and only if, any of the designs includes either the color I have written 
down, or the shape I have written down, but not both, then it is called a THOG. I will tell 
you that the Blue Diamond is a THOG. Each of the designs can now be classified into 
one of the following categories: A) Definitely is a THOG, B) Insufficient information to
decide, C) Definitely is not a THOG."






APPENDIX B: MATERIALS 






Experiments 1a and 1b

Standard THOG without Question 1, three-choice instruction

At the bottom of the page are four designs: Black Diamond, White Diamond, 
Black Circle, and White Circle. You are to assume that I have written down one 
of the colors (white or black) and one of the shapes (diamond or circle). Now 
read the following rule carefully: If, and only if, any of the designs includes either 
the color I have written down, or the shape I have written down, but not both, 
then it is called a THOG. 

I will tell you that the Black Diamond is a THOG. 

Each of the designs can now be classified into one of the following categories: 
definitely a THOG, definitely not a THOG, or insufficient information to decide. 
Indicate your answer for each design in the space next to it. 




THOG 





TNQ1_3CHo1 






THOG without Question 1, two-choice instruction



At the bottom of the page are four designs: Black Diamond, White Diamond, 
Black Circle, and White Circle. You are to assume that I have written down one 
of the colors (white or black) and one of the shapes (diamond or circle). Now 
read the following rule carefully: 

If, and only if, any of the designs includes either the color I have written down, or 
the shape I have written down, but not both, then it is called a THOG. 

I will tell you that the Black Diamond is a THOG. 

Each of the designs can now be classified into one of the following categories: 
definitely a THOG or definitely not a THOG. Indicate your answer for each 
design in the space next to it. 




THOG 





TNQ1 2CHo1 












THOG with Question 1, three-choice instruction (Experiment 1b only)



At the bottom of the page are four designs: Black Diamond, White Diamond, 
Black Circle, and White Circle. You are to assume that I have written down one 
of the colors (white or black) and one of the shapes (diamond or circle). Now 
read the following rule carefully: 

If, and only if, any of the designs includes either the color I have written down, or 
the shape I have written down, but not both, then it is called a THOG. 

I will tell you that the Black Diamond is a THOG. 

(1) Knowing for sure that the Black Diamond is a THOG, your first task is to 
indicate which color and shape combination(s) I could have written down.



(2) Each of the designs can now be classified into one of the following 
categories: definitely a THOG, definitely not a THOG, or insufficient information 
to decide. Indicate your answer for each design in the space next to it. 



THOG 






TQ1 3CHo1 



THOG with Question 1, two-choice instruction (Experiment 1b only)






At the bottom of the page are four designs: Black Diamond, White Diamond, 
Black Circle, and White Circle. You are to assume that I have written down one 
of the colors (white or black) and one of the shapes (diamond or circle). Now 
read the following rule carefully: 

If, and only if, any of the designs includes either the color I have written down, or 
the shape I have written down, but not both, then it is called a THOG. 

I will tell you that the Black Diamond is a THOG. 

(1) Knowing for sure that the Black Diamond is a THOG, your first task is to 
indicate which color and shape combination(s) I could have written down.



(2) Each of the designs can now be classified into one of the following 
categories: definitely a THOG or definitely not a THOG. Indicate your answer 
for each design in the space next to it. 



THOG 






TQ1 2CHo1 




SARS without Question 1, three-choice instruction

At the bottom of the page are four designs: Black Diamond, White Diamond, 
Black Circle, and White Circle. 

I have defined one of these designs as a SARS. You do not know which design 
this is, but you do know that a design is a THOG if it has either the color or the 
shape of the SARS, but not both. 

I will tell you that the Black Diamond is a THOG. 

Each of the designs can now be classified into one of the following categories: 
definitely a THOG, definitely not a THOG, or insufficient information to decide. 
Indicate your answer for each design in the space next to it. 




THOG 





SNQ1 3CHo1 
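
The SARS wording replaces the two written-down properties with a single reference design. As a rough illustration, and not as part of the original materials, the Python sketch below counts shared features between designs; the helper names `shared_features`, `sars_candidates`, and `other_thogs` are hypothetical. Under the stated rule, a design is a THOG when it shares exactly one feature with the SARS.

```python
from itertools import product

# The four designs used in the materials, as (color, shape) pairs.
DESIGNS = list(product(("Black", "White"), ("Diamond", "Circle")))


def shared_features(a, b):
    """Number of features (color, shape) on which two designs agree."""
    return sum(x == y for x, y in zip(a, b))


def sars_candidates(known_thog):
    """Designs that could be the SARS, given that known_thog is a THOG."""
    return [d for d in DESIGNS if shared_features(d, known_thog) == 1]


def other_thogs(known_thog):
    """Designs, other than known_thog, that are THOGs under every candidate SARS."""
    candidates = sars_candidates(known_thog)
    return [
        d for d in DESIGNS
        if d != known_thog
        and all(shared_features(d, s) == 1 for s in candidates)
    ]


print(sars_candidates(("Black", "Diamond")))
print(other_thogs(("Black", "Diamond")))
```

Framing the rule as a count of shared features makes it visible that the two SARS candidates are the designs differing from the Black Diamond on exactly one feature, while the only other THOG is the design differing on both.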






SARS without Question 1, two-choice instruction

At the bottom of the page are four designs: Black Diamond, White Diamond, 
Black Circle, and White Circle. 

I have defined one of these designs as a SARS. You do not know which design 
this is, but you do know that a design is a THOG if it has either the color or the 
shape of the SARS, but not both. 

I will tell you that the Black Diamond is a THOG. 

Each of the designs can now be classified into one of the following categories: 
definitely a THOG or definitely not a THOG. Indicate your answer for each 
design in the space next to it.




THOG 





SNQ1 2CHo1 






SARS with Question 1, three-choice instruction 

At the bottom of the page are four designs: Black Diamond, White Diamond, 
Black Circle, and White Circle. 

I have defined one of these designs as a SARS. You do not know which design 
this is, but you do know that a design is a THOG if it has either the color or the 
shape of the SARS, but not both. 

(1) Knowing for sure that the Black Diamond is a THOG, your first task is to 
indicate which design(s) could be the SARS. 



(2) Each of the designs can now be classified into one of the following 
categories: definitely a THOG, definitely not a THOG, or insufficient information 
to decide. Indicate your answer for each design in the space next to it. 



THOG 






SQ1 3CHo1 




SARS with Question 1, two-choice instruction 

At the bottom of the page are four designs: Black Diamond, White Diamond, 
Black Circle, and White Circle. 

I have defined one of these designs as a SARS. You do not know which design 
this is, but you do know that a design is a THOG if it has either the color or the 
shape of the SARS, but not both. 

(1) Knowing for sure that the Black Diamond is a THOG, your first task is to 
indicate which design(s) could be the SARS. 



(2) Each of the designs can now be classified into one of the following 
categories: definitely a THOG or definitely not a THOG. Indicate your answer 
for each design in the space next to it. 



THOG 






SQ1 2CHo1 






Experiments 2a and 2b 
Standard THOG without Question 1, three-choice instruction






In front of you are four designs: Black Diamond, White Diamond, Black Circle, 
and White Circle. 







You are to assume that I have written down one of the colors (white or black) 
and one of the shapes (diamond or circle). Now read the following rule carefully: 
If, and only if, any of the designs includes either the color I have written down, or 
the shape I have written down, but not both, then it is called a THOG. 

I will tell you that the Black Diamond is a THOG. 

Each of the designs can now be classified into one of the following categories: 
definitely a THOG, definitely not a THOG, or insufficient information to decide. 
You have to indicate your answer for each design. 



GLTNQ1 3cno6 









THOG without Question 1, "other" instruction



In front of you are four designs: Black Diamond, White Diamond, Black Circle, 
and White Circle. 







You are to assume that I have written down one of the colors (white or black) 
and one of the shapes (diamond or circle). Now read the following rule carefully: 
If, and only if, any of the designs includes either the color I have written down, or 
the shape I have written down, but not both, then it is called a THOG. 

I will tell you that the Black Diamond is a THOG. 

Knowing for sure that the Black Diamond is a THOG, could you indicate whether, 
in addition to the Black Diamond, there are other THOGs? If there are other 
THOGs, you have to indicate which design or designs could be a THOG. 



GLTNQ1 RO06 






Hypotheses-THOG with Question 1, "other" instruction

In front of you are four designs: Black Diamond, White Diamond, Black Circle, 
and White Circle. 







You are to assume that I have written down one of the colors (white or black) 
and one of the shapes (diamond or circle). Now read the following rule carefully: 
If, and only if, any of the designs includes either the color I have written down, or 
the shape I have written down, but not both, then it is called a THOG. 

I will tell you that the Black Diamond is a THOG. 

Knowing for sure that the Black Diamond is a THOG, you have to indicate which 
combination or combinations of shape and color I could have written down. 



Could you also indicate whether, in addition to the Black Diamond, there are 
other THOGs? If there are other THOGs, you have to indicate which design or 
designs could be a THOG. 



GLTQ1 RO06 









SARS with Question 1, "other" instruction

In front of you are four designs: Black Diamond, White Diamond, Black Circle, 
and White Circle. 







I have defined one of these designs as a SARS. You do not know which design 
this is. But you do know that a design is a THOG if it has either the color or the 
shape of the SARS, but not both. 

Knowing for sure that the Black Diamond is a THOG, you have to indicate which 
one or which ones, among the remaining designs, could be the SARS. 



Could you also indicate whether, in addition to the Black Diamond, there are 
other THOGs? If there are other THOGs, you have to indicate which design or 
designs could be a THOG. 



GLSQ1 RO06 



Experiment 3

THOG without Question 1, one-other instruction

In front of you are four designs: Black Diamond, White Diamond, Black Circle,
and White Circle. 







You are to assume that I have written down one of the colors (white or black)
and one of the shapes (diamond or circle). Now read the following rule carefully:
If, and only if, any of the designs includes either the color I have written down, or
the shape I have written down, but not both, then it is called a THOG.

I will tell you that the Black Diamond is a THOG. 

In addition to the Black Diamond, one other design is a THOG. Which other
design is a THOG? 



THNQ1_10To6v 






THOG with Question 1, one-other instruction 

In front of you are four designs: Black Diamond, White Diamond, Black Circle,
and White Circle. 







You are to assume that I have written down one of the colors (white or black) 
and one of the shapes (diamond or circle). Now read the following rule carefully:
If, and only if, any of the designs includes either the color I have written down, or
the shape I have written down, but not both, then it is called a THOG. 

I will tell you that the Black Diamond is a THOG. 

Knowing for sure that the Black Diamond is a THOG, you have to indicate which 
combination or combinations of shape and color I could have written down. 



In addition to the Black Diamond, one other design is a THOG. Which other 
design is a THOG? 



TQ1 10THo6v 









SARS without Question 1, one-other instruction

In front of you are four designs: Black Diamond, White Diamond, Black Circle,
and White Circle. 







I have defined one of these designs as a SARS. You do not know which design
this is, but you do know that a design is a THOG if it has either the color or the
shape of the SARS, but not both.

I will tell you that the Black Diamond is a THOG. 

In addition to the Black Diamond, one other design is a THOG. Which other
design is a THOG? 



SNQ1 10To6v 









SARS with Question 1, one-other instruction

In front of you are four designs: Black Diamond, White Diamond, Black Circle,
and White Circle. 







I have defined one of these designs as a SARS. You do not know which design
this is, but you do know that a design is a THOG if it has either the color or the
shape of the SARS, but not both.

Knowing for sure that the Black Diamond is a THOG, you have to indicate which
one or which ones, among the remaining designs, could be the SARS. 



In addition to the Black Diamond, one other design is a THOG. Which other
design is a THOG? 



SQ1_10To6v 



Experiment 4a 
Standard THOG without Question 1, three-choice instruction






In front of you are four designs: Black Diamond, White Diamond, Black Circle, and White
Circle.







You are to assume that I have written down one of the colors (white or black) and one of the 
shapes (diamond or circle). Now read the following rule carefully: 

If, and only if, any of the designs includes either the color I have written down or the 
shape I have written down, but not both, then it is called a THOG. 



I will tell you that the Black Diamond is a THOG.

Each of the designs can now be classified into one of the following categories: Definitely a
THOG, Definitely not a THOG, or Insufficient information to decide. Indicate your answer
for each design in the space beneath it.



TNQ14 4 



Explanation THOG 

In front of you are four designs: Black Diamond, White Diamond, Black Circle, and White
Circle.







You are to assume that I have written down one of the colors (white or black) and one of the 
shapes (diamond or circle). Now read the following rule carefully: 

If, and only if, any of the designs includes either the color I have written down or the 
shape I have written down, but not both, then it is called a THOG. 

I will tell you that the Black Diamond is a THOG. 

Consider all of the combinations that I might have written down: Diamond and Black, Circle
and Black, Diamond and White, and Circle and White. I could not have written down both
Diamond and Black because the Black Diamond includes both of these features. Similarly,
I could not have written down both Circle and White because the Black Diamond contains
neither of these features. But the remaining combinations each include one of the features
of a Black Diamond, and hence I could have written down either Circle and Black, or
Diamond and White.

Consider first the possibility that I wrote down Circle and Black. In this case, a Black Circle
cannot be a THOG because it includes both features, and a White Diamond cannot be a
THOG because it includes neither feature. But a White Circle is a THOG because it
includes one of the features and not the other. Consider next the possibility that I wrote
down Diamond and White. In this case, a Black Circle cannot be a THOG because it
includes neither of the features, and a White Diamond cannot be a THOG because it
includes both features. Again, a White Circle is a THOG because it includes one of the two
features and not the other.

Each of the designs can now be classified into one of the following categories: Definitely a
THOG, Definitely not a THOG, or Insufficient information to decide. Indicate your answer
for each design in the space beneath it.



OBE4 4 



Pretest THOG (grouped-questions)

In front of you are four designs: Black Diamond, White Diamond, Black Circle, and White 
Circle. 







You are to assume that I have written down one of the colors (white or black) and one of the 
shapes (diamond or circle). Now read the following rule carefully: 

If, and only if, any of the designs includes either the color I have written down, or the 
shape I have written down, but not both, then it is called a THOG. 

I will tell you that the Black Diamond is a THOG. 

Because you know that a THOG has either the shape I have written down or the color I 
have written down, but not both, it follows that either the color I have written down is Black 
or the shape I have written down is Diamond, but not both. Further, because there are only 
two shapes, it follows that if the color I have written down is Black, then the shape I have 
written down is Circle; and because there are only two colors, it follows that if the shape I 
have written down is Diamond, then the color I have written down is White. There are,
therefore, two possible combinations that I could have written down: Diamond and White,
or Circle and Black. 

(1) First, suppose I wrote down Diamond and White. In this case, which of the designs can 
be a THOG? And in this case, which of the designs cannot be a THOG? Second,
suppose that I wrote down Circle and Black. In this case, which of the designs can be a 
THOG? And in this case, which of the designs cannot be a THOG? 



(2) Each of the designs can now be classified into one of the following categories: Definitely
a THOG, Definitely not a THOG, or Insufficient information to decide. Indicate your 
answer for each design in the space beneath it. 



OBP4 4 



Pretest THOG (split-questions)

In front of you are four designs: Black Diamond, White Diamond, Black Circle, and White
Circle. 







You are to assume that I have written down one of the colors (white or black) and one of the 
shapes (diamond or circle). Now read the following rule carefully: 

If, and only if, any of the designs includes either the color I have written down or the
shape I have written down, but not both, then it is called a THOG. 

I will tell you that the Black Diamond is a THOG. 

Because you know that a THOG has either the shape I have written down or the color I 
have written down, but not both, it follows that either the color I have written down is Black 
or the shape I have written down is Diamond, but not both. Further, because there are only
two shapes, it follows that if the color I have written down is Black, then the shape I have
written down is Circle; and because there are only two colors, it follows that if the shape I
have written down is Diamond, then the color I have written down is White. There are,
therefore, two possible combinations that I could have written down: Diamond and White,
or Circle and Black. 

(1) First, suppose I wrote down Diamond and White. In this case, which of the designs can 
be a THOG? 



And in this case, which of the designs cannot be a THOG? 

(2) Second, suppose that I wrote down Circle and Black. In this case, which of the designs 
can be a THOG? 

And in this case, which of the designs cannot be a THOG? 



(3) Each of the designs can now be classified into one of the following categories: Definitely
a THOG, Definitely not a THOG, or Insufficient information to decide. Indicate your
answer for each design in the space beneath it.

OBPs4 4 



Experiment 4b

Pretest THOG (split-questions), with Qualifier

In front of you are four designs: Black Diamond, White Diamond, Black Circle, and
White Circle. 







You are to assume that I have written down one of the colors (white or black) and one of 
the shapes (diamond or circle). Now read the following rule carefully: 

If, and only if, any of the designs includes either the color I have written down or 
the shape I have written down, but not both, then it is called a THOG. 

I will tell you that the Black Diamond is a THOG. 

Because you know that a THOG has either the shape I have written down or the color I 
have written down, but not both, it follows that either the color I have written down is 
Black or the shape I have written down is Diamond, but not both. Further, because
there are only two shapes, it follows that if the color I have written down is Black, then
the shape I have written down is Circle; and because there are only two colors, it follows
that if the shape I have written down is Diamond, then the color I have written down is
White. There are, therefore, two possible combinations that I could have written down:
Diamond and White, or Circle and Black. 

(1) First, suppose I wrote down Diamond and White. In this case, which of the designs,
other than the Black Diamond, can be a THOG?

And in this case, which of the designs cannot be a THOG? 



(2) Second, suppose that I wrote down Circle and Black. In this case, which of the
designs, other than the Black Diamond, can be a THOG? 



And in this case, which of the designs cannot be a THOG? 



(3) Each of the designs can now be classified into one of the following categories:
Definitely a THOG, Definitely not a THOG, or Insufficient information to decide.
Indicate your answer for each design in the space beneath it. 

OBPs3 4 



Pretest THOG (split-questions), with Qualifier plus Reminder

In front of you are four designs: Black Diamond, White Diamond, Black Circle, and
White Circle. 







You are to assume that I have written down one of the colors (white or black) and one of 
the shapes (diamond or circle). Now read the following rule carefully: 

If, and only if, any of the designs includes either the color I have written down or 
the shape I have written down, but not both, then it is called a THOG. 

I will tell you that the Black Diamond is a THOG. 

Because you know that a THOG has either the shape I have written down or the color I 
have written down, but not both, it follows that either the color I have written down is 
Black or the shape I have written down is Diamond, but not both. Further, because
there are only two shapes, it follows that if the color I have written down is Black, then
the shape I have written down is Circle; and because there are only two colors, it follows
that if the shape I have written down is Diamond, then the color I have written down is
White. There are, therefore, two possible combinations that I could have written down:
Diamond and White, or Circle and Black. 

(1) First, suppose I wrote down Diamond and White. In this case, which of the designs,
other than the Black Diamond, can be a THOG? 

And in this case, which of the designs cannot be a THOG? 



(2) Second, suppose that I wrote down Circle and Black. In this case, which of the 
designs, other than the Black Diamond, can be a THOG? 



And in this case, which of the designs cannot be a THOG? 



Reminder: Given the rule and the fact that the Black Diamond is a THOG, I could not
have written down Diamond and Black without creating a contradiction.



(3) Each of the designs can now be classified into one of the following categories:
Definitely a THOG, Definitely not a THOG, or Insufficient information to decide.
Indicate your answer for each design in the space beneath it. 

OBPs3R 



Pretest THOG (split-questions), with Qualifier plus "Definite"

In front of you are four designs: Black Diamond, White Diamond, Black Circle, and
White Circle. 







You are to assume that I have written down one of the colors (white or black) and one of 
the shapes (diamond or circle). Now read the following rule carefully: 

If, and only if, any of the designs includes either the color I have written down or 
the shape I have written down, but not both, then it is called a THOG. 

I will tell you that the Black Diamond is a THOG. 

Because you know that a THOG has either the shape I have written down or the color I 
have written down, but not both, it follows that either the color I have written down is 
Black or the shape I have written down is Diamond, but not both. Further, because
there are only two shapes, it follows that if the color I have written down is Black, then
the shape I have written down is Circle; and because there are only two colors, it follows
that if the shape I have written down is Diamond, then the color I have written down is
White. There are, therefore, two possible combinations that I could have written down:
Diamond and White, or Circle and Black. 

(1) First, suppose I wrote down Diamond and White. In this case, which of the designs,
other than the Black Diamond, is definitely a THOG?



And in this case, which of the designs is definitely not a THOG? 



(2) Second, suppose that I wrote down Circle and Black. In this case, which of the
designs, other than the Black Diamond, is definitely a THOG? 



And in this case, which of the designs is definitely not a THOG? 



(3) Each of the designs can now be classified into one of the following categories:
Definitely a THOG, Definitely not a THOG, or Insufficient information to decide.
Indicate your answer for each design in the space beneath it. 

OBPs3D 4 






Pretest THOG (split-questions), with Qualifier plus "Definite" plus Reminder

In front of you are four designs: Black Diamond, White Diamond, Black Circle, and
White Circle. 







You are to assume that I have written down one of the colors (white or black) and one of 
the shapes (diamond or circle). Now read the following rule carefully: 

If, and only if, any of the designs includes either the color I have written down or 
the shape I have written down, but not both, then it is called a THOG. 

I will tell you that the Black Diamond is a THOG. 

Because you know that a THOG has either the shape I have written down or the color I 
have written down, but not both, it follows that either the color I have written down is 
Black or the shape I have written down is Diamond, but not both. Further because 
there are only two shapes, it follows that if the color I have written down is Black then 
the shape I have written down is Circle; and because there are only two colors it follows 
that if the shape I have written down is Diamond, then the color I have written down is 
White. There are, therefore, two possible combinations that I could have written down: 
Diamond and White, or Circle and Black. 

(1) First, suppose I wrote down Diamond and White. In this case, which of the designs, 
other than the Black Diamond, is definitely a THOG? 

And in this case, which of the designs is definitely not a THOG? 



(2) Second, suppose that I wrote down Circle and Black. In this case which of the 
designs, other than the Black Diamond, is definitely a THOG? 

And in this case, which of the designs is definitely not a THOG? 



Reminder: Given the rule and the fact that the Black Diamond is a THOG, I could not 
have written down Diamond and Black without creating a contradiction. 



(3) Each of the designs can now be classified into one of the following categories: 
Definitely a THOG, Definitely not a THOG, or Insufficient information to decide. 
Indicate your answer for each design in the space beneath it. 

OBPS3DR 4 
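
The reasoning that the split-question materials above walk through can be checked by brute-force enumeration. The following is a minimal illustrative sketch, not part of the original materials: the function names (is_thog, consistent_hypotheses, classify) are hypothetical, and the rule is assumed to be a plain exclusive disjunction over the written-down color and shape.

```python
from itertools import product

COLORS = ("Black", "White")
SHAPES = ("Diamond", "Circle")
DESIGNS = [(color, shape) for color in COLORS for shape in SHAPES]


def is_thog(design, written):
    """Apply the rule: a design is a THOG if and only if it matches the
    written-down color or the written-down shape, but not both."""
    color, shape = design
    written_color, written_shape = written
    return (color == written_color) != (shape == written_shape)


def consistent_hypotheses(known_thog=("Black", "Diamond")):
    """Return every (color, shape) pair that could have been written down,
    given that known_thog is a THOG."""
    return [w for w in product(COLORS, SHAPES) if is_thog(known_thog, w)]


def classify(known_thog=("Black", "Diamond")):
    """Classify each design by testing it under every consistent hypothesis."""
    hypotheses = consistent_hypotheses(known_thog)
    result = {}
    for design in DESIGNS:
        verdicts = {is_thog(design, w) for w in hypotheses}
        if verdicts == {True}:
            result[design] = "Definitely a THOG"
        elif verdicts == {False}:
            result[design] = "Definitely not a THOG"
        else:
            result[design] = "Insufficient information to decide"
    return result


if __name__ == "__main__":
    print(consistent_hypotheses())
    for design, verdict in classify().items():
        print(design, verdict)
```

Under these assumptions, the two hypotheses that survive are the ones named in the materials (Circle and Black, or Diamond and White), and each design receives the same classification under both of them, which is why no design falls into the "Insufficient information" category.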



BIOGRAPHICAL SKETCH 
The author began her undergraduate education at Barnard College. She then spent 
eight years in the field of consumer research in New York City and Los Angeles, 
designing, implementing and analyzing a wide range of surveys and focused group 
interviews for major manufacturers, airlines and political campaigns. While in New York, 
she continued her education in the evenings at Pace College. After relocating to Florida, 
she served as an administrator in an industrial organization, then formed her own small 
transportation company. 

In May, 1993, she received her B. S. in psychology from the University of Central 
Florida in Orlando. Her research there focused on confirmatory bias in decision-making. 
During the summer of 1993, she worked as a technical writer at the Institute for 
Simulation and Training in Orlando. The author entered the graduate program in 
cognitive and sensory processes at the University of Florida in August 1993. Focusing on 
the psychology of writing, she conducted research comparing different methodologies and 
evaluating different forms of quality assessment. In her master's thesis, she explored 
expertise-related differences in computer-based writing. After receiving her M. S. in 
August 1995, she embarked on an investigation of reasoning abilities, culminating in this 
dissertation. 






I certify that I have read this study and that in my opinion it conforms to 
acceptable standards of scholarly presentation and is fully acceptable, in scope and 
quality, as a dissertation for the degree of Doctor of Philosophy. 




Professor of Psychology 



I certify that I have read this study and that in my opinion it conforms to 
acceptable standards of scholarly presentation and is fully acceptable, in scope and 
quality, as a dissertation for the degree of Doctor of Philosophy. 







Shari Ellis 

Assistant Professor of Psychology 



I certify that I have read this study and that in my opinion it conforms to 
acceptable standards of scholarly presentation and is fully acceptable, in scope and 
quality, as a dissertation for the degree of Doctor of Philosophy. 




Ira Fischler 

Professor of Psychology 

I certify that I have read this study and that in my opinion it conforms to 
acceptable standards of scholarly presentation and is fully acceptable, in scope and 
quality, as a dissertation for the degree of Doctor of Philosophy. 




Chris Janiszewski 

Associate Professor of Marketing 

I certify that I have read this study and that in my opinion it conforms to 
acceptable standards of scholarly presentation and is fully acceptable, in scope and 
quality, as a dissertation for the degree of Doctor of Philosophy. 




Patricia H. Miller 
Professor of Psychology 



This dissertation was submitted to the Graduate Faculty of the Department of 
Psychology in the College of Liberal Arts and Sciences and to the Graduate School and 
was accepted as partial fulfillment of the requirements for the degree of Doctor of 
Philosophy. 

August 1998 



Dean, Graduate School